Deep learning
Collaborative Deep Learning and Information Fusion of Heterogeneous Latent Variable Models for Industrial Quality Prediction
IEEE Trans Cybern. 2025 Feb 21;PP. doi: 10.1109/TCYB.2025.3537809. Online ahead of print.
ABSTRACT
In recent years, latent variable models have played an important role in various industrial AI systems, among which quality prediction is one of the most representative applications. Inspired by the idea of deep learning, these basic latent variable models have been extended to deep forms, based on which quality prediction performance has been significantly improved. However, different latent variable models have their own strengths and weaknesses, and a model that works well in one scenario might not provide satisfactory performance in another. This article is motivated by the viewpoint of information fusion and ensemble learning for heterogeneous latent variable models. In particular, a collaborative deep learning and model fusion framework is formulated for industrial quality prediction. In the first stage of the framework, collaborative layer-by-layer feature extractions are implemented among different latent variable models, through which different patterns of latent variables are identified in different layers of the deep model. In the second stage, an ensemble regression modeling strategy based on a well-designed data description method is proposed to fuse the quality prediction results from the different latent variable models. Two real industrial examples are used for performance evaluation of the proposed method, from which we observe that information fusion in terms of both collaborative layer-by-layer feature extraction and heterogeneous model ensemble has a positive effect on prediction accuracy and stability.
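The second-stage fusion idea lends itself to a compact illustration. Below is a minimal, hypothetical sketch of error-weighted fusion of heterogeneous regressors: the base models, the inverse-MSE weighting rule, and the synthetic data are assumptions for illustration, not the paper's data-description method.

```python
# Sketch: fuse predictions from heterogeneous regression models, weighting
# each model inversely to its validation error (an assumed weighting rule).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.cross_decomposition import PLSRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(200, 10)), rng.normal(size=200)
X_val, y_val = rng.normal(size=(50, 10)), rng.normal(size=50)
X_test = rng.normal(size=(20, 10))

# Heterogeneous base learners standing in for different latent variable models.
models = [PLSRegression(n_components=3), LinearRegression()]
preds_val, preds_test = [], []
for m in models:
    m.fit(X_train, y_train)
    preds_val.append(np.ravel(m.predict(X_val)))
    preds_test.append(np.ravel(m.predict(X_test)))

# Weight each model by the inverse of its validation MSE, then fuse.
errors = np.array([mean_squared_error(y_val, p) for p in preds_val])
weights = (1.0 / errors) / np.sum(1.0 / errors)
y_fused = np.average(np.vstack(preds_test), axis=0, weights=weights)
print(y_fused[:5])
```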
PMID:40036535 | DOI:10.1109/TCYB.2025.3537809
Co-Training Broad Siamese-Like Network for Coupled-View Semi-Supervised Learning
IEEE Trans Cybern. 2025 Feb 21;PP. doi: 10.1109/TCYB.2025.3531441. Online ahead of print.
ABSTRACT
Multiview semi-supervised learning is a popular research area in which cross-view knowledge is used to overcome the shortage of labeled data in semi-supervised learning. Existing methods mainly rely on deep neural networks, which are relatively time-consuming due to their complex structures and backpropagation iterations. In this article, the co-training broad Siamese-like network (Co-BSLN) is proposed for coupled-view semi-supervised classification. Co-BSLN learns knowledge from two-view data and can be applied to multiview data with the help of feature concatenation. Unlike existing deep learning methods, Co-BSLN uses a simple shallow network based on the broad learning system (BLS) to simplify the network structure and reduce training time. It replaces backpropagation iterations with a direct pseudo-inverse calculation to further reduce time consumption. In Co-BSLN, different views of the same instance are treated as positive pairs due to cross-view consistency. Predictions of views in positive pairs are used to guide the training of each other through a direct logit vector mapping. Such a design is fast and effectively exploits cross-view consistency to improve the accuracy of semi-supervised learning. Evaluation results demonstrate that Co-BSLN improves accuracy and reduces training time on popular datasets.
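To make the pseudo-inverse point concrete, here is a minimal sketch of a broad-learning-style readout, assuming random feature and enhancement mappings and a ridge regularizer; it shows only the closed-form solve that replaces backpropagation, not the Co-BSLN co-training scheme itself.

```python
# Sketch: BLS-style output weights from one ridge-regularized pseudo-inverse
# solve. Layer sizes and the regularizer lam are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))                  # input features
Y = np.eye(3)[rng.integers(0, 3, size=500)]     # one-hot labels, 3 classes

# Random (untrained) feature and enhancement mappings, as in a BLS.
W_f = rng.normal(size=(20, 50))
W_e = rng.normal(size=(50, 30))
Z = np.tanh(X @ W_f)                            # mapped feature nodes
H = np.tanh(Z @ W_e)                            # enhancement nodes
A = np.hstack([Z, H])                           # broad expanded layer

# Single closed-form solve: W = (A^T A + lam*I)^(-1) A^T Y.
lam = 1e-2
W = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ Y)
pred = np.argmax(A @ W, axis=1)
print("train accuracy:", np.mean(pred == np.argmax(Y, axis=1)))
```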
PMID:40036533 | DOI:10.1109/TCYB.2025.3531441
NciaNet: A Non-Covalent Interaction-Aware Graph Neural Network for the Prediction of Protein-Ligand Interaction in Drug Discovery
IEEE J Biomed Health Inform. 2025 Mar 4;PP. doi: 10.1109/JBHI.2025.3547741. Online ahead of print.
ABSTRACT
Precise quantification of protein-ligand interaction is critical in early-stage drug discovery. Artificial intelligence (AI) has gained massive popularity in this area, with deep-learning models used to extract features from ligand and protein molecules. However, these models often fail to capture intermolecular non-covalent interactions, the primary factor influencing binding, leading to lower accuracy and interpretability. Moreover, such models overlook the spatial structure of protein-ligand complexes, resulting in weaker generalization. To address these issues, we propose the Non-covalent Interaction-aware Graph Neural Network (NciaNet), a novel method that effectively utilizes intermolecular non-covalent interactions and 3D protein-ligand structure. Our approach achieves excellent predictive performance on multiple benchmark datasets and outperforms competitive baseline models in the binding affinity task, achieving an RMSE of 1.208 and an R of 0.833 on the benchmark core set v.2016 and an RMSE of 1.409 and an R of 0.805 on the core set v.2013, when trained on the high-quality refined set v.2016. Importantly, NciaNet successfully learns vital features related to protein-ligand interactions, providing biochemical insights and demonstrating practical utility and reliability. However, despite these strengths, there may still be limitations in generalizability to unseen protein-ligand complexes, suggesting potential avenues for future work.
PMID:40036511 | DOI:10.1109/JBHI.2025.3547741
An AI-Based Clinical Decision Support System for Antibiotic Therapy in Sepsis (KINBIOTICS): Use Case Analysis
JMIR Hum Factors. 2025 Mar 4;12:e66699. doi: 10.2196/66699.
ABSTRACT
BACKGROUND: Antimicrobial resistance poses significant challenges to health care systems. Clinical decision support systems (CDSSs) represent a potential strategy for promoting more targeted, guideline-based use of antibiotics. The integration of artificial intelligence (AI) into these systems has the potential to support physicians in selecting the most effective drug therapy for a given patient.
OBJECTIVE: This study aimed to analyze the feasibility of an AI-based CDSS pilot version for antibiotic therapy in sepsis patients and identify facilitating and inhibiting conditions for its implementation in intensive care medicine.
METHODS: The evaluation was conducted in 2 steps, using a qualitative methodology. Initially, expert interviews were conducted, in which intensive care physicians were asked to assess the AI-based recommendations for antibiotic therapy in terms of plausibility, layout, and design. Subsequently, focus group interviews were conducted to examine the technology acceptance of the AI-based CDSS. The interviews were anonymized and evaluated using content analysis.
RESULTS: In terms of feasibility, barriers included variability in previous antibiotic administration practices, which affected the predictive ability of the AI recommendations, and the increased effort required to justify deviations from these recommendations. Physicians' confidence in accepting or rejecting recommendations depended on their level of professional experience. The ability to re-evaluate CDSS recommendations and an intuitive, user-friendly system design were identified as factors that enhanced acceptance and usability. Further barriers included low levels of digitization in clinical practice, limited availability of cross-sectoral data, and negative previous experiences with CDSSs. Conversely, facilitators of CDSS implementation were potential time savings, physicians' openness to adopting new technologies, and positive previous experiences.
CONCLUSIONS: Early integration of users is beneficial for both the identification of relevant context factors and the further development of an effective CDSS. Overall, the potential of AI-based CDSSs is offset by inhibiting contextual conditions that impede its acceptance and implementation. The advancement of AI-based CDSSs and the mitigation of these inhibiting conditions are crucial for the realization of its full potential.
PMID:40036494 | DOI:10.2196/66699
Cone-beam computed tomography (CBCT) image-quality improvement using a denoising diffusion probabilistic model conditioned by pseudo-CBCT of pelvic regions
Radiol Phys Technol. 2025 Mar 4. doi: 10.1007/s12194-025-00892-4. Online ahead of print.
ABSTRACT
Cone-beam computed tomography (CBCT) is widely used in radiotherapy to image patient configuration before treatment, but its image quality is lower than that of planning CT due to scattering, motion, and reconstruction methods. This reduces the accuracy of Hounsfield units (HU) and limits its use in adaptive radiation therapy (ART). Moreover, synthetic CT (sCT) generation using deep learning methods for CBCT intensity correction faces challenges due to deformation. To address these issues, we propose enhancing CBCT quality using a conditional denoising diffusion probabilistic model (CDDPM), which is trained on pseudo-CBCT created by adding pseudo-scatter to planning CT. The CDDPM transforms CBCT into high-quality sCT, improving HU accuracy while preserving anatomical configuration. Performance evaluation of the proposed sCT showed a reduction in mean absolute error (MAE) from 81.19 HU for CBCT to 24.89 HU for the sCT, and the peak signal-to-noise ratio (PSNR) improved from 31.20 dB for CBCT to 33.81 dB for the sCT. The Dice and Jaccard coefficients between CBCT and sCT for the colon, prostate, and bladder ranged from 0.69 to 0.91. Compared with other deep learning models, the proposed sCT outperformed them in accuracy and anatomical preservation. Dosimetry analysis for prostate cancer revealed a dose error of over 10% with CBCT but nearly 0% with the sCT, and gamma pass rates for the proposed sCT exceeded 90% for all dose criteria, indicating high agreement with CT-based dose distributions. These results show that the proposed sCT improves image quality, dosimetric accuracy, and treatment planning, advancing ART for pelvic cancer.
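The two image-quality metrics reported above are straightforward to compute on voxel arrays in HU. A minimal sketch follows; the PSNR data range is an assumption, since the abstract does not state one.

```python
# Sketch: MAE (in HU) and PSNR (in dB) between a synthetic CT and planning CT.
import numpy as np

def mae_hu(sct, ct):
    return np.mean(np.abs(sct.astype(float) - ct.astype(float)))

def psnr_db(sct, ct, data_range=2000.0):  # assumed HU dynamic range
    mse = np.mean((sct.astype(float) - ct.astype(float)) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

rng = np.random.default_rng(0)
ct = rng.normal(0, 300, size=(64, 64, 64))          # synthetic stand-in volume
sct = ct + rng.normal(0, 20, size=ct.shape)         # small residual error
print(f"MAE: {mae_hu(sct, ct):.2f} HU, PSNR: {psnr_db(sct, ct):.2f} dB")
```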
PMID:40035984 | DOI:10.1007/s12194-025-00892-4
Application of TransUnet Deep Learning Model for Automatic Segmentation of Cervical Cancer in Small-Field T2WI Images
J Imaging Inform Med. 2025 Mar 4. doi: 10.1007/s10278-025-01464-z. Online ahead of print.
ABSTRACT
Effective segmentation of cervical cancer tissue from magnetic resonance (MR) images is crucial for automatic detection, staging, and treatment planning of cervical cancer. This study develops an innovative deep learning model to enhance the automatic segmentation of cervical cancer lesions. We obtained 4063 T2WI small-field sagittal, coronal, and oblique axial images from 222 patients with pathologically confirmed cervical cancer. Using this dataset, we employed a convolutional neural network (CNN) along with TransUnet models for segmentation training and evaluation of cervical cancer tissues. In this approach, CNNs are leveraged to extract local information from MR images, whereas Transformers capture long-range dependencies related to shape and structural information, which are critical for precise segmentation. Furthermore, we developed three distinct segmentation models based on coronal, axial, and sagittal T2WI within a small field of view using multidirectional MRI techniques. The Dice similarity coefficient (DSC) and average Hausdorff distance (AHD) were used to assess segmentation accuracy. The average DSC and AHD values obtained using the TransUnet model were 0.7628 and 0.8687, respectively, improving on the U-Net model by 0.0033 in DSC and 0.3479 in AHD. The proposed TransUnet segmentation model significantly enhances the accuracy of cervical cancer tissue delineation compared with alternative models, demonstrating superior overall segmentation efficacy. This methodology can improve clinical diagnostic efficiency as an automated image analysis tool tailored for cervical cancer diagnosis.
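For reference, here is a minimal sketch of the two segmentation metrics on binary masks. The AHD is computed as the symmetric mean of nearest-point distances over all mask voxels, which is one common reading of "average Hausdorff distance"; the toy masks are assumptions.

```python
# Sketch: Dice similarity coefficient and average Hausdorff distance.
import numpy as np
from scipy.spatial.distance import cdist

def dice(a, b):
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

def avg_hausdorff(a, b):
    pa, pb = np.argwhere(a), np.argwhere(b)    # voxel coordinates of each mask
    d = cdist(pa, pb)                          # pairwise distances
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())

a = np.zeros((32, 32), bool); a[8:20, 8:20] = True
b = np.zeros((32, 32), bool); b[10:22, 9:21] = True
print(f"DSC: {dice(a, b):.4f}, AHD: {avg_hausdorff(a, b):.4f}")
```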
PMID:40035972 | DOI:10.1007/s10278-025-01464-z
An Efficient Approach for Detection of Various Epileptic Waves Having Diverse Forms in Long Term EEG Based on Deep Learning
Brain Topogr. 2025 Mar 4;38(3):35. doi: 10.1007/s10548-025-01111-4.
ABSTRACT
EEG is the most powerful tool for detecting epileptic discharges in the brain. Visual evaluation of long-term EEG monitoring data is difficult because a huge amount of data must be inspected. The speed and efficiency of deep learning networks, especially convolutional networks, and their capability to detect complex epileptic waveforms inspired us to evaluate YOLO networks as a spike detection solution. The most widely used versions of YOLO (V3, V4, and V7) were evaluated on various epileptic signals. The epileptic discharge waveforms were first labeled as 9 different signal types and then classified into four group combinations based on their features. EEG data from 20 patients were used under the guidance of an expert epileptologist. The YOLO networks were trained for all four class-grouping strategies. YOLO-V4 proved the most suitable network, giving average sensitivity, specificity, and accuracy of 96.7%, 94.3%, and 92.8%, respectively, across all four classification strategies. YOLO networks have shown promising results in detecting epileptic signals, and with some additional measures they could become a valuable assistant tool for epileptologists. In addition to its high speed and accuracy in detecting epileptic signals in EEG, YOLO can classify these signals into different morphologies.
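As a quick reference, the three detection metrics reported above follow directly from confusion-matrix counts. A minimal sketch with illustrative counts:

```python
# Sketch: sensitivity, specificity, and accuracy from TP/FP/TN/FN counts.
def detection_metrics(tp, fp, tn, fn):
    sensitivity = tp / (tp + fn)               # true positive rate
    specificity = tn / (tn + fp)               # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy

sens, spec, acc = detection_metrics(tp=290, fp=17, tn=283, fn=10)
print(f"sensitivity {sens:.1%}, specificity {spec:.1%}, accuracy {acc:.1%}")
```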
PMID:40035961 | DOI:10.1007/s10548-025-01111-4
Accelerated retinal ageing and multimorbidity in middle-aged and older adults
Geroscience. 2025 Mar 4. doi: 10.1007/s11357-025-01581-1. Online ahead of print.
ABSTRACT
The aim of this study was to investigate the association between the retinal age gap and multimorbidity. The retinal age gap was calculated for 45,436 participants using a previously developed deep learning model. The number of age-related conditions reported at baseline was summed and categorized as zero, one, or at least two conditions (multimorbidity). Incident multimorbidity was defined as the onset of two or more age-related diseases during the follow-up period. Linear regressions were fitted to examine the associations of baseline disease numbers with retinal age gaps. Cox proportional hazards regression models were used to examine associations of retinal age gaps with the incidence of multimorbidity. In the fully adjusted model, participants with multimorbidity and those with one disease both showed significant increases in baseline retinal age gaps compared with participants with no age-related disease (β = 0.254, 95% CI 0.154, 0.354; P < 0.001; β = 0.203, 95% CI 0.116, 0.291; P < 0.001; respectively). After a median follow-up of 11.38 years (IQR, 11.26-11.53; range, 0.02-11.81), a total of 3607 (17.29%) participants had incident multimorbidity. Each 5-year increase in baseline retinal age gap was independently associated with an 8% increase in the risk of multimorbidity (HR = 1.08, 95% CI 1.02, 1.14, P = 0.008). Our study demonstrated that an increased retinal age gap was independently associated with a greater risk of incident multimorbidity. By recognizing deviations from normal aging, individuals at higher risk of developing multimorbidity can be identified early, facilitating self-management and personalized interventions before disease onset.
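The survival analysis described above can be sketched with the lifelines package. The column names, synthetic data, and covariate set are assumptions; rescaling the retinal age gap into 5-year units makes the fitted hazard ratio directly interpretable per 5-year increase.

```python
# Sketch: Cox proportional hazards model of incident multimorbidity vs.
# retinal age gap (hypothetical columns; synthetic data for illustration).
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "retinal_age_gap_5y": rng.normal(0, 3, n) / 5.0,   # gap in 5-year units
    "age": rng.uniform(45, 75, n),
    "followup_years": rng.uniform(0.02, 11.81, n),
    "multimorbidity": rng.integers(0, 2, n),           # incident event flag
})

cph = CoxPHFitter()
cph.fit(df, duration_col="followup_years", event_col="multimorbidity")
# exp(coef) for retinal_age_gap_5y is the HR per 5-year gap increase.
print(cph.summary[["exp(coef)", "exp(coef) lower 95%", "exp(coef) upper 95%"]])
```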
PMID:40035945 | DOI:10.1007/s11357-025-01581-1
New AI explained and validated deep learning approaches to accurately predict diabetes
Med Biol Eng Comput. 2025 Mar 4. doi: 10.1007/s11517-025-03338-6. Online ahead of print.
ABSTRACT
Diabetes is a metabolic condition that can lead to chronic illness and organ failure if it remains untreated, so accurate early detection is essential to reduce these risks. Recent advancements in predictive models show promising results, but these models exhibit inadequate accuracy, struggle with class imbalance, and lack interpretability of the decision-making process. To overcome these issues, we propose two novel deep models for early and accurate diabetes prediction: LeDNet (inspired by LeNet and the Dual Attention Network) and HiDenNet (influenced by the highway network and DenseNet). The models are trained on the Diabetes Health Indicators dataset, which has an inherent class imbalance problem that leads to biased predictions. This imbalance is mitigated by employing the majority-weighted minority over-sampling technique. Experimental findings demonstrate that LeDNet achieves an F1-score of 85%, recall of 84%, accuracy of 85%, and precision of 86%. Similarly, HiDenNet achieves accuracy, F1-score, recall, and precision of 85%, 86%, 86%, and 86%, respectively. Both proposed models outperform state-of-the-art deep learning (DL) models. K-fold cross-validation is applied to ensure the models' stability across different data splits. Local interpretable model-agnostic explanations (LIME) and Shapley additive explanations (SHAP) are utilized to enhance interpretability and overcome the traditional black-box nature of DL models. By providing both local and global insights into feature contributions, these explainable artificial intelligence techniques bring transparency to LeDNet and HiDenNet in diabetes prediction. LeDNet and HiDenNet not only improve decision-making transparency but also enhance diabetes prediction accuracy, making them reliable tools for clinical decision-making and early diagnosis.
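The rebalancing step can be illustrated with a minimal resampling sketch. The paper uses the majority-weighted minority over-sampling technique (MWMOTE); since a stock MWMOTE is not available in imbalanced-learn, plain SMOTE stands in here purely to show the workflow, and the synthetic data are assumptions.

```python
# Sketch: over-sampling a minority class before training (SMOTE as a
# stand-in for MWMOTE, which is not in imbalanced-learn).
import numpy as np
from imblearn.over_sampling import SMOTE

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))                 # health-indicator features
y = (rng.random(1000) < 0.1).astype(int)       # ~10% minority (diabetes) class

X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print("before:", np.bincount(y), "after:", np.bincount(y_res))
```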
PMID:40035798 | DOI:10.1007/s11517-025-03338-6
A novel deep learning framework for automatic scoring of PD-L1 expression in non-small cell lung cancer
Biomol Biomed. 2025 Mar 3. doi: 10.17305/bb.2025.12056. Online ahead of print.
ABSTRACT
A critical predictive marker for anti-PD-1/PD-L1 therapy is programmed death-ligand 1 (PD-L1) expression, assessed by immunohistochemistry (IHC). This paper explores a novel automated deep learning framework to accurately evaluate PD-L1 expression from whole slide images (WSIs) of non-small cell lung cancer (NSCLC), aiming to improve the precision and consistency of Tumor Proportion Score (TPS) evaluation, which is essential for determining patient eligibility for immunotherapy. Automating TPS evaluation can enhance accuracy and consistency while reducing pathologists' workload. The proposed framework encompasses three stages: identifying tumor patches, segmenting tumor areas, and detecting cell nuclei within these areas, followed by estimating the TPS as the ratio of positively stained to total viable tumor cells. This study utilized a Reference Medicine (Phoenix, Arizona) dataset containing 66 NSCLC tissue samples, adopting a hybrid human-machine approach for annotating extensive WSIs. Patches of 1000 x 1000 pixels were generated to train classification models such as EfficientNet, Inception, and Vision Transformer models. Additionally, segmentation performance was evaluated across various UNet and DeepLabV3 architectures, and the pre-trained StarDist model was employed for nuclei detection, replacing traditional watershed techniques. PD-L1 expression was categorized into three levels based on TPS: negative expression (TPS < 1%), low expression (TPS 1-49%), and high expression (TPS ≥ 50%). The Vision Transformer-based model excelled in classification, achieving an F1-score of 97.54%, while the modified DeepLabV3+ model led in segmentation, attaining a Dice similarity coefficient of 83.47%. The TPS predicted by the framework correlated closely with the pathologists' TPS (correlation coefficient 0.9635), and the framework's three-level classification F1-score was 93.89%. The proposed deep learning framework for automatically evaluating the TPS of PD-L1 expression in NSCLC demonstrated promising performance and presents a potential tool for producing clinically significant results more efficiently and cost-effectively.
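The final scoring step is a simple ratio plus binning, which a short sketch makes explicit. The thresholds come from the abstract; the cell counts are illustrative.

```python
# Sketch: Tumor Proportion Score (TPS) from nuclei counts, binned into the
# three PD-L1 expression levels stated above.
def tps_category(positive_cells: int, viable_tumor_cells: int) -> str:
    tps = 100.0 * positive_cells / viable_tumor_cells
    if tps < 1.0:
        return f"negative (TPS {tps:.1f}%)"
    if tps < 50.0:
        return f"low (TPS {tps:.1f}%)"
    return f"high (TPS {tps:.1f}%)"

print(tps_category(12, 2400))    # negative expression
print(tps_category(600, 2400))   # low expression
print(tps_category(1500, 2400))  # high expression
```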
PMID:40035693 | DOI:10.17305/bb.2025.12056
Two-Stage Deep Learning Model for Adrenal Nodule Detection on CT Images: A Retrospective Study
Radiology. 2025 Mar;314(3):e231650. doi: 10.1148/radiol.231650.
ABSTRACT
BACKGROUND: The detection and classification of adrenal nodules are crucial for their management.
PURPOSE: To develop and test a deep learning model to automatically depict adrenal nodules on abdominal CT images and to simulate triaging performance in combination with human interpretation.
MATERIALS AND METHODS: This retrospective study (January 2000-December 2020) used an internal dataset enriched with adrenal nodules for model training and testing and an external dataset reflecting real-world practice for further simulated testing in combination with human interpretation. The deep learning model had a two-stage architecture, a sequential detection and segmentation model, trained separately for the right and left adrenal glands. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC) for nodule detection and intersection over union for nodule segmentation.
RESULTS: Of a total of 995 patients in the internal dataset, the AUCs for detecting right and left adrenal nodules in internal test set 1 (n = 153) were 0.98 (95% CI: 0.96, 1.00; P < .001) and 0.93 (95% CI: 0.87, 0.98; P < .001), respectively. These values were 0.98 (95% CI: 0.97, 0.99; P < .001) and 0.97 (95% CI: 0.96, 0.97; P < .001) in the external test set (n = 12,080) and 0.90 (95% CI: 0.84, 0.95; P < .001) and 0.89 (95% CI: 0.85, 0.94; P < .001) in internal test set 2 (n = 1214). The median intersection over union was 0.64 (IQR, 0.43-0.71) and 0.53 (IQR, 0.40-0.64) for right and left adrenal nodules, respectively. Combining the model with human interpretation achieved high sensitivity (up to 100%) and specificity (up to 99%), with triaging performance ranging from 0.77 to 0.98.
CONCLUSION: The deep learning model demonstrated high performance and has the potential to improve the detection of incidental adrenal nodules. © RSNA, 2025. Supplemental material is available for this article. See also the editorial by Malayeri and Turkbey in this issue.
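The segmentation metric used above, intersection over union, reduces to two set operations on binary masks. A minimal sketch with toy masks:

```python
# Sketch: intersection over union (IoU) between predicted and reference masks.
import numpy as np

def iou(pred, truth):
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return inter / union

pred = np.zeros((64, 64), bool);  pred[10:30, 10:30] = True
truth = np.zeros((64, 64), bool); truth[12:32, 12:32] = True
print(f"IoU: {iou(pred, truth):.3f}")
```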
PMID:40035671 | DOI:10.1148/radiol.231650
Unveiling the Future: A Deep Learning Model for Accurate Detection of Adrenal Nodules
Radiology. 2025 Mar;314(3):e250387. doi: 10.1148/radiol.250387.
NO ABSTRACT
PMID:40035670 | DOI:10.1148/radiol.250387
Hybrid ladybug Hawk optimization-enabled deep learning for multimodal Parkinson's disease classification using voice signals and hand-drawn images
Network. 2025 Mar 4:1-43. doi: 10.1080/0954898X.2025.2457955. Online ahead of print.
ABSTRACT
Parkinson's disease (PD) is a progressive neurodegenerative disorder that leads to gradual motor impairment. Early detection is critical for slowing the disease's progression and giving patients access to timely therapies. However, accurately detecting PD in its early stages remains challenging. This study aims to develop an optimized deep learning model for PD classification using voice signals and hand-drawn spiral images, leveraging a ZFNet-LHO-DRN framework. The proposed model first preprocesses the input voice signal with a Gaussian filter to remove noise. Features are then extracted from the preprocessed signal and passed to ZFNet to generate output-1. The hand-drawn spiral image is preprocessed with a bilateral filter, followed by image augmentation; features are likewise extracted and forwarded to the DRN to form output-2. Both classifiers are trained using the ladybug hawk optimization (LHO) algorithm. Finally, the better of output-1 and output-2 is selected based on majority voting. The ZFNet-LHO-DRN model demonstrated excellent performance, achieving an accuracy of 89.8%, an NPV of 89.7%, a PPV of 89.7%, a TNR of 89.3%, and a TPR of 90.1%. The model's high accuracy and performance indicate its potential as a valuable tool for assisting in the early diagnosis of PD.
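The abstract leaves the two-head fusion rule underspecified, so the sketch below shows one plausible reading: each modality classifier emits class probabilities, the final label is taken by vote, and the more confident head breaks ties when the two disagree. The function name, tie-break rule, and probabilities are assumptions.

```python
# Sketch: one reading of fusing two modality classifiers (voice and image).
import numpy as np

def fuse(p_voice: np.ndarray, p_image: np.ndarray) -> int:
    c1, c2 = int(np.argmax(p_voice)), int(np.argmax(p_image))
    if c1 == c2:                        # both heads agree
        return c1
    # Disagreement: trust the head with the higher peak confidence.
    return c1 if p_voice.max() >= p_image.max() else c2

print(fuse(np.array([0.8, 0.2]), np.array([0.7, 0.3])))    # agree -> 0
print(fuse(np.array([0.55, 0.45]), np.array([0.2, 0.8])))  # image wins -> 1
```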
PMID:40035544 | DOI:10.1080/0954898X.2025.2457955
Artificial intelligence in the detection and treatment of depressive disorders: a narrative review of literature
Int Rev Psychiatry. 2025 Feb;37(1):39-51. doi: 10.1080/09540261.2024.2384727. Epub 2024 Jul 30.
ABSTRACT
Modern psychiatry aims to adopt precision models and promote personalized treatment within mental health care. However, the complexity of factors underpinning mental disorders and the variety of expressions of clinical conditions make this task arduous for clinicians. Globally, major depression is a common mental disorder and encompasses a constellation of clinical manifestations and a variety of etiological factors. In this context, the use of Artificial Intelligence might help clinicians in the screening and diagnosis of depression on a wider scale and could also facilitate their task in predicting disease outcomes by considering complex interactions between prodromal and clinical symptoms, neuroimaging data, genetics, or biomarkers. In this narrative review, we report on the most significant evidence from current international literature regarding the use of Artificial Intelligence in the diagnosis and treatment of major depression, specifically focusing on the use of Natural Language Processing, Chatbots, Machine Learning, and Deep Learning.
PMID:40035375 | DOI:10.1080/09540261.2024.2384727
Overview of artificial intelligence in hand surgery
J Hand Surg Eur Vol. 2025 Mar 4:17531934251322723. doi: 10.1177/17531934251322723. Online ahead of print.
ABSTRACT
Artificial intelligence has evolved significantly since its inception, becoming a powerful tool in medicine. This paper provides an overview of the core principles, applications and future directions of artificial intelligence in hand surgery. Artificial intelligence has shown promise in improving diagnostic accuracy, predicting outcomes and assisting in patient education. However, despite its potential, its application in hand surgery is still nascent, with most studies being retrospective and limited by small sample sizes. To harness the full potential of artificial intelligence in hand surgery and support broader adoption, more robust, large-scale studies are needed. Collaboration among researchers, through data sharing and federated learning, is essential for advancing artificial intelligence from experimental to clinically validated tools, ultimately enhancing patient care and clinical workflows.
PMID:40035151 | DOI:10.1177/17531934251322723
The application of artificial intelligence in insomnia, anxiety, and depression: A bibliometric analysis
Digit Health. 2025 Mar 2;11:20552076251324456. doi: 10.1177/20552076251324456. eCollection 2025 Jan-Dec.
ABSTRACT
BACKGROUND: Mental health issues like insomnia, anxiety, and depression have increased significantly. Artificial intelligence (AI) has shown promise in diagnosing and providing personalized treatment.
OBJECTIVE: This study aims to systematically review the application of AI in addressing insomnia, anxiety, and depression, identifying key research hotspots, and forecasting future trends through bibliometric analysis.
METHODS: We analyzed a total of 875 articles from the Web of Science Core Collection (2000-2024) using bibliometric tools such as VOSviewer and CiteSpace. These tools were used to map research trends, highlight international collaboration, and examine the contributions of leading countries, institutions, and authors in the field.
RESULTS: The United States and China lead the field in terms of research output and collaborations. Key research areas include "neural networks," "machine learning," "deep learning," and "human-robot interaction," particularly in relation to personalized treatment approaches. However, challenges around data privacy, ethical concerns, and the interpretability of AI models need to be addressed.
CONCLUSIONS: This study highlights the growing role of AI in mental health research and identifies future priorities, such as improving data quality, addressing ethical challenges, and integrating AI more seamlessly into clinical practice. These advancements will be crucial in addressing the global mental health crisis.
PMID:40035038 | PMC:PMC11873874 | DOI:10.1177/20552076251324456
Evaluating the Quality and Readability of Generative Artificial Intelligence (AI) Chatbot Responses in the Management of Achilles Tendon Rupture
Cureus. 2025 Jan 31;17(1):e78313. doi: 10.7759/cureus.78313. eCollection 2025 Jan.
ABSTRACT
INTRODUCTION: The rise of artificial intelligence (AI), including generative chatbots like ChatGPT (OpenAI, San Francisco, CA, USA), has revolutionized many fields, including healthcare. Patients have gained the ability to prompt chatbots to generate purportedly accurate and individualized healthcare content. This study analyzed the readability and quality of answers to Achilles tendon rupture questions from six generative AI chatbots to evaluate and distinguish their potential as patient education resources.
METHODS: The six AI models used were ChatGPT 3.5, ChatGPT 4, Gemini 1.0 (previously Bard; Google, Mountain View, CA, USA), Gemini 1.5 Pro, Claude (Anthropic, San Francisco, CA, USA) and Grok (xAI, Palo Alto, CA, USA) without prior prompting. Each was asked 10 common patient questions about Achilles tendon rupture, determined by five orthopaedic surgeons. The readability of generative responses was measured using Flesch-Kincaid Reading Grade Level, Gunning Fog, and SMOG (Simple Measure of Gobbledygook). The response quality was subsequently graded using the DISCERN criteria by five blinded orthopaedic surgeons.
RESULTS: Gemini 1.0 generated responses with statistically significantly better ease of readability (closest to the average American reading level) than responses from ChatGPT 3.5, ChatGPT 4, and Claude. Additionally, mean DISCERN scores demonstrated significantly higher quality of responses from Gemini 1.0 (63.0±5.1) and ChatGPT 4 (63.8±6.2) than from ChatGPT 3.5 (53.8±3.8), Claude (55.0±3.8), and Grok (54.2±4.8). However, the overall quality rating (DISCERN question 16) of each model, when averaged, was above average (range, 3.4-4.4).
DISCUSSION AND CONCLUSION: Our results indicate that generative chatbots can potentially serve as patient education resources alongside physicians. Although some models lacked sufficient content, each performed above average in overall quality. With the lowest reading grade level and among the highest DISCERN scores, Gemini 1.0 outperformed ChatGPT, Claude, and Grok, emerging as potentially the simplest and most reliable generative chatbot for the management of Achilles tendon rupture.
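One of the readability measures named in the Methods, the Flesch-Kincaid grade level, has a fixed published formula: 0.39 x (words/sentences) + 11.8 x (syllables/words) - 15.59. The sketch below applies it with a crude vowel-group syllable heuristic, which is an assumption; the study presumably used dedicated readability tooling.

```python
# Sketch: Flesch-Kincaid reading grade level with a heuristic syllable count.
import re

def count_syllables(word: str) -> int:
    # Rough heuristic: count groups of consecutive vowels (incl. y).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text: str) -> float:
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / sentences + 11.8 * syllables / len(words) - 15.59

print(f"{fk_grade('The tendon tore. Surgery or a cast can repair it.'):.1f}")
```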
PMID:40034889 | PMC:PMC11872741 | DOI:10.7759/cureus.78313
Cardiotocography-Based Experimental Comparison of Artificial Intelligence and Human Judgment in Assessing Fetal Asphyxia During Delivery
Cureus. 2025 Jan 31;17(1):e78282. doi: 10.7759/cureus.78282. eCollection 2025 Jan.
ABSTRACT
Cardiotocography (CTG) has long been the standard method for monitoring fetal status during delivery. Despite its widespread use, human error and variability in CTG interpretation contribute to adverse neonatal outcomes, with over 70% of stillbirths, neonatal deaths, and brain injuries potentially avoidable through accurate analysis. Recent advancements in artificial intelligence (AI) offer opportunities to address these challenges by complementing human judgment. This study experimentally compared the diagnostic accuracy of AI and human specialists in predicting fetal asphyxia using CTG data. Machine learning (ML) and deep learning (DL) algorithms were developed and trained on 3,519 CTG datasets. Human specialists independently assessed 50 CTG figures each through web-based questionnaires. A total of 984 CTG figures from singleton pregnancies were evaluated, and outcomes were compared using receiver operating characteristic (ROC) analysis. Human diagnosis achieved the highest area under the curve (AUC) of 0.693 (p = 0.0003), outperforming AI-based methods (ML: AUC = 0.514, p = 0.788; DL: AUC = 0.524, p = 0.662). Although DL-assisted judgment improved sensitivity and identified cases missed by humans, it did not surpass the accuracy of human judgment alone. Combining human and AI predictions did not exceed the AUC of human diagnosis alone (0.693) but improved specificity (91.92% for humans alone vs. 98.03% for humans plus DL), highlighting AI's potential to complement human judgment by reducing false-positive rates. Our findings underscore the need for further refinement of AI algorithms and the accumulation of CTG data to enhance diagnostic accuracy. Integrating AI into clinical workflows could reduce human error, optimize resource allocation, and improve neonatal outcomes, particularly in resource-limited settings. These advancements promise a future where AI assists obstetricians in making more objective and accurate decisions during delivery.
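The core of the comparison above is AUC computed per rater against the asphyxia outcome. A minimal sketch, using synthetic scores that merely imitate an informative human rater and a near-chance model:

```python
# Sketch: AUC comparison between human and model scores on the same outcomes.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=984)                 # asphyxia outcome (0/1)
human_score = y * 0.4 + rng.random(984) * 0.8    # loosely informative rater
model_score = y * 0.05 + rng.random(984)         # nearly uninformative model

print(f"human AUC: {roc_auc_score(y, human_score):.3f}")
print(f"model AUC: {roc_auc_score(y, model_score):.3f}")
```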
PMID:40034878 | PMC:PMC11875211 | DOI:10.7759/cureus.78282
BandFocusNet: A Lightweight Model for Motor Imagery Classification of a Supernumerary Thumb in Virtual Reality
IEEE Open J Eng Med Biol. 2025 Feb 3;6:305-311. doi: 10.1109/OJEMB.2025.3537760. eCollection 2025.
ABSTRACT
OBJECTIVE: Human movement augmentation through supernumerary effectors is an emerging field of research. However, controlling these effectors remains challenging due to issues with agency, control, and synchronizing movements with natural limbs. A promising control strategy for supernumerary effectors involves utilizing electroencephalography (EEG) through motor imagery (MI) functions. In this work, we investigate whether MI activity associated with a supernumerary effector can be reliably differentiated from that of a natural one, thus addressing the concern of concurrency. Twenty subjects were recruited to participate in a two-fold experiment in which they observed movements of natural and supernumerary thumbs and then engaged in MI of the observed movements, conducted in a virtual reality setting.
RESULTS: A lightweight deep learning model that accounts for the temporal, spatial, and spectral nature of the EEG data, called BandFocusNet, is proposed, achieving an average classification accuracy of 70.9% under leave-one-subject-out cross-validation. The trustworthiness of the model is examined through explainability analysis, and influential regions of interest are cross-validated through event-related spectral perturbation (ERSP) analysis. Explainability results showed the importance of the right and left frontal cortical regions, and ERSP analysis showed an increase in delta and theta power in these regions during MI of the natural thumb but not during MI of the supernumerary thumb.
CONCLUSION: Evidence in the literature indicates that such activation is observed during the MI of natural effectors, and its absence could be interpreted as a lack of embodiment of the supernumerary thumb.
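Leave-one-subject-out cross-validation, the evaluation protocol above, holds out all trials of one subject per fold so that test subjects are never seen in training. A minimal sketch follows; the feature matrix, trial counts, and the logistic-regression stand-in for BandFocusNet are assumptions.

```python
# Sketch: leave-one-subject-out (LOSO) cross-validation over EEG trials.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(0)
n_subjects, trials = 20, 30
X = rng.normal(size=(n_subjects * trials, 64))      # per-trial EEG features
y = rng.integers(0, 2, size=n_subjects * trials)    # natural vs. extra thumb
groups = np.repeat(np.arange(n_subjects), trials)   # subject label per trial

accs = []
for tr, te in LeaveOneGroupOut().split(X, y, groups):
    clf = LogisticRegression(max_iter=1000).fit(X[tr], y[tr])
    accs.append(clf.score(X[te], y[te]))            # one held-out subject
print(f"LOSO accuracy: {np.mean(accs):.3f}")
```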
PMID:40034836 | PMC:PMC11875636 | DOI:10.1109/OJEMB.2025.3537760
Artificial intelligence in stroke risk assessment and management via retinal imaging
Front Comput Neurosci. 2025 Feb 17;19:1490603. doi: 10.3389/fncom.2025.1490603. eCollection 2025.
ABSTRACT
Retinal imaging, used for assessing stroke-related retinal changes, is a non-invasive and cost-effective method that can be enhanced by machine learning and deep learning algorithms, showing promise in early disease detection, severity grading, and prognostic evaluation in stroke patients. This review explores the role of artificial intelligence (AI) in stroke patient care, focusing on the integration of retinal imaging into clinical workflows. Retinal imaging has revealed several microvascular changes, including a decrease in the central retinal artery diameter and an increase in the central retinal vein diameter, both of which are associated with lacunar stroke and intracranial hemorrhage. Additionally, microvascular changes such as arteriovenous nicking, increased vessel tortuosity, enhanced arteriolar light reflex, decreased retinal fractals, and thinning of the retinal nerve fiber layer are also reported to be associated with higher stroke risk. AI models, such as Xception and EfficientNet, have demonstrated accuracy comparable to traditional stroke risk scoring systems in predicting stroke risk. For stroke diagnosis, models like Inception, ResNet, and VGG, alongside machine learning classifiers, have shown high efficacy in distinguishing stroke patients from healthy individuals using retinal imaging. Moreover, a random forest model effectively distinguished between ischemic and hemorrhagic stroke subtypes based on retinal features, showing superior predictive performance compared with traditional clinical characteristics. Additionally, a support vector machine model has achieved high classification accuracy in assessing pial collateral status. Despite these advancements, challenges persist, including the lack of standardized protocols for imaging modalities, hesitance in trusting AI-generated predictions, insufficient integration of retinal imaging data with electronic health records, the need for validation across diverse populations, and ethical and regulatory concerns. Future efforts must focus on validating AI models across diverse populations, ensuring algorithm transparency, and addressing ethical and regulatory issues to enable broader implementation. Overcoming these barriers will be essential for translating this technology into personalized stroke care and improving patient outcomes.
PMID:40034651 | PMC:PMC11872910 | DOI:10.3389/fncom.2025.1490603