Deep learning
A deep learning method based on multi-scale fusion for noise-resistant coal-gangue recognition
Sci Rep. 2025 Jan 2;15(1):101. doi: 10.1038/s41598-024-83604-z.
ABSTRACT
Coal-gangue recognition technology plays an important role in the intelligent realization of integrated working faces and coal quality improvement. However, the existing methods are easily affected by high dust, noise, and other disturbances, resulting in unstable recognition results that make it difficult to meet the needs of industrial applications. To realize accurate recognition of coal-gangue in noisy environments, this paper proposes an end-to-end multi-scale feature fusion convolutional neural network (MCNN-BILSTM) based gangue recognition method, which can automatically learn and fuse complementary information from multiple signal components of vibration signals. It combines traditional filtering methods and the idea of multi-scale learning, which can expand the breadth and depth of the feature learning process. Moreover, to strengthen the expression of key features, a feature weighting method based on the attention mechanism is combined to give adaptive weights to different features. Finally, the experimental platform of a tail beam of coal-gangue impact hydraulic support is built, and several comparative experiments are carried out. The comparison experiments show that the method has strong adaptability, robustness, and noise resistance under various complex noise environments, and is suitable for complex practical industrial sites.
PMID:39747222 | DOI:10.1038/s41598-024-83604-z
Two-Dimensional Transition Metal Dichalcogenides: A Theory and Simulation Perspective
Chem Rev. 2025 Jan 2. doi: 10.1021/acs.chemrev.4c00628. Online ahead of print.
ABSTRACT
Two-dimensional transition metal dichalcogenides (2D TMDs) are a promising class of functional materials for fundamental physics explorations and applications in next-generation electronics, catalysis, quantum technologies, and energy-related fields. Theory and simulations have played a pivotal role in recent advancements, from understanding physical properties and discovering new materials to elucidating synthesis processes and designing novel devices. The key has been developments in ab initio theory, deep learning, molecular dynamics, high-throughput computations, and multiscale methods. This review focuses on how theory and simulations have contributed to recent progress in 2D TMDs research, particularly in understanding properties of twisted moiré-based TMDs, predicting exotic quantum phases in TMD monolayers and heterostructures, understanding nucleation and growth processes in TMD synthesis, and comprehending electron transport and characteristics of different contacts in potential devices based on TMD heterostructures. The notable achievements provided by theory and simulations are highlighted, along with the challenges that need to be addressed. Although 2D TMDs have demonstrated potential and prototype devices have been created, we conclude by highlighting research areas that demand the most attention and how theory and simulation might address them and aid in attaining the true potential of 2D TMDs toward commercial device realizations.
PMID:39746214 | DOI:10.1021/acs.chemrev.4c00628
Leveraging Artificial Intelligence/Machine Learning Models to Identify Potential Palliative Care Beneficiaries: A Systematic Review
J Gerontol Nurs. 2025 Jan;51(1):7-14. doi: 10.3928/00989134-20241210-01. Epub 2025 Jan 1.
ABSTRACT
PURPOSE: The current review examined the application of artificial intelligence (AI) and machine learning (ML) techniques in palliative care, specifically focusing on models used to identify potential beneficiaries of palliative services among individuals with chronic and terminal illnesses.
METHODS: A systematic review was conducted across four electronic databases. Five studies met inclusion criteria, all of which applied AI/ML models to predict outcomes relevant to palliative care, such as mortality or the need for services.
RESULTS: Of 1,504 studies screened, five studies used supervised ML algorithms, whereas one used natural language processing with a deep learning model to identify potential palliative care candidates. The most common AI/ML algorithms included neural network-based models, logistic regression, and tree-based models.
CONCLUSION: AI and ML models offer promising avenues for identifying palliative care beneficiaries. As AI continues to evolve, its potential to reshape palliative care through early identification is significant, providing opportunities for timely and targeted care interventions. [Journal of Gerontological Nursing, 51(1), 7-14.].
PMID:39746126 | DOI:10.3928/00989134-20241210-01
Incident duration prediction through integration of uncertainty and risk factor evaluation: A San Francisco incidents case study
PLoS One. 2025 Jan 2;20(1):e0316289. doi: 10.1371/journal.pone.0316289. eCollection 2025.
ABSTRACT
Predicting incident duration and understanding incident types are essential in traffic management for resource optimization and disruption minimization. Precise predictions enable the efficient deployment of response teams and strategic traffic rerouting, leading to reduced congestion and enhanced safety. Furthermore, an in-depth understanding of incident types helps in implementing preventive measures and formulating strategies to alleviate their influence on road networks. In this paper, we present a comprehensive framework for accurately predicting incident duration, with a particular emphasis on the critical role of street conditions and locations as major incident triggers. To demonstrate the effectiveness of our framework, we performed an in-depth case study using a dataset from San Francisco. We introduce a novel feature called "Risk" derived from the Risk Priority Number (RPN) concept, highlighting the significance of the incident location in both incident occurrence and prediction. Additionally, we propose a refined incident categorization through fuzzy clustering methods, delineating a unique policy for identifying boundary clusters that necessitate further modeling and testing under varying scenarios. Each cluster undergoes a Multiple Criteria Decision-Making (MCDM) process to gain deeper insights into their distinctions and provide valuable managerial insights. Finally, we employ both traditional Machine Learning (ML) and Deep Learning (DL) models to perform classification and regression tasks. Specifically, incidents residing in boundary clusters are predicted utilizing the scenarios outlined in this study. Through a rigorous analysis of feature importance using top-performing predictive models, we identify the "Risk" factor as a critical determinant of incident duration. Moreover, variables such as distance, humidity, and hour demonstrate significant influence, further enhancing the predictive power of the proposed model.
PMID:39746103 | DOI:10.1371/journal.pone.0316289
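The abstract above does not spell out how its "Risk" feature is computed. As a rough illustration of the Risk Priority Number idea it borrows, where RPN is conventionally the product of severity, occurrence, and detection ratings, the following Python sketch scores two hypothetical locations. The 1 to 10 rating scale, the location names, and the values are assumptions made for illustration, not data or definitions from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocationRatings:
    """Hypothetical 1-10 ratings for one incident location (not from the study)."""
    severity: int
    occurrence: int
    detection: int


def risk_priority_number(r: LocationRatings) -> int:
    """Conventional FMEA-style RPN: severity x occurrence x detection."""
    for name, value in (("severity", r.severity),
                        ("occurrence", r.occurrence),
                        ("detection", r.detection)):
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be in 1..10, got {value}")
    return r.severity * r.occurrence * r.detection


# Made-up example locations, ranked from highest to lowest RPN.
locations = {
    "intersection_a": LocationRatings(severity=7, occurrence=8, detection=3),
    "intersection_b": LocationRatings(severity=4, occurrence=5, detection=2),
}
ranked = sorted(locations,
                key=lambda k: risk_priority_number(locations[k]),
                reverse=True)
```

A score of this kind could then be attached to each incident record as one input feature among others; how the authors actually derive and scale their feature may differ.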
An investigation of feature reduction, transferability, and generalization in AWID datasets for secure Wi-Fi networks
PLoS One. 2025 Jan 2;20(1):e0306747. doi: 10.1371/journal.pone.0306747. eCollection 2025.
ABSTRACT
The widespread use of wireless networks to transfer an enormous amount of sensitive information has caused a plethora of vulnerabilities and privacy issues. The management frames, particularly authentication and association frames, are vulnerable to cyberattacks, which is a significant concern. Existing research in Wi-Fi attack detection has focused on obtaining high detection accuracy while neglecting modern traffic and attack scenarios such as key reinstallation or unauthorized decryption attacks. This study proposed a novel approach using the AWID 3 dataset for cyberattack detection. The retained features were analyzed to assess their transferability, creating a lightweight and cost-effective model. A decision tree with a recursive feature elimination method was implemented for the extraction of the reduced feature subset, and an additional feature, wlan_radio.signal_dbm, was used in combination with the extracted feature subset. Several deep learning and machine learning models were implemented, where DT and CNN achieved promising classification results. Further, feature transferability and generalizability were evaluated, and their detection performance was analyzed across different network versions, where CNN outperformed other classification models. The practical implications of this research are crucial for the secure automation of wireless intrusion detection frameworks and tools in personal and enterprise paradigms.
PMID:39746088 | DOI:10.1371/journal.pone.0306747
Artificial intelligence in dentistry: Assessing the informational quality of YouTube videos
PLoS One. 2025 Jan 2;20(1):e0316635. doi: 10.1371/journal.pone.0316635. eCollection 2025.
ABSTRACT
BACKGROUND AND PURPOSE: The most widely used social media platform for video content is YouTube™. The present study evaluated the quality of information on YouTube™ on artificial intelligence (AI) in dentistry.
METHODS: This cross-sectional study used YouTube™ (https://www.youtube.com) for searching videos. The terms used for the search were "artificial intelligence in dentistry," "machine learning in dental care," and "deep learning in dentistry." The accuracy and reliability of the information source were assessed using the DISCERN score. The quality of the videos was evaluated using the modified Global Quality Score (mGQS) and the Journal of the American Medical Association (JAMA) score.
RESULTS: The analysis of 91 YouTube™ videos on AI in dentistry revealed insights into video characteristics, content, and quality. On average, videos were 22.45 minutes long and received 1715.58 views and 23.79 likes. The topics were mainly centered on general dentistry (66%), followed by radiology (18%), orthodontics (9%), prosthodontics (4%), and implants (3%). DISCERN and mGQS scores were higher for videos uploaded by healthcare professionals and for educational content videos (P<0.05). DISCERN exhibited a strong correlation with the video source (0.75) and with JAMA (0.77). The correlation between the video's content and mGQS was 0.66, indicating a moderate correlation.
CONCLUSION: YouTube™ has informative and moderately reliable videos on AI in dentistry. Dental students, dentists and patients can use these videos to learn and educate about artificial intelligence in dentistry. Professionals should upload more videos to enhance the reliability of the content.
PMID:39746083 | DOI:10.1371/journal.pone.0316635
A phase transition in diffusion models reveals the hierarchical nature of data
Proc Natl Acad Sci U S A. 2025 Jan 7;122(1):e2408799121. doi: 10.1073/pnas.2408799121. Epub 2025 Jan 2.
ABSTRACT
Understanding the structure of real data is paramount in advancing modern deep-learning methodologies. Natural data such as images are believed to be composed of features organized in a hierarchical and combinatorial manner, which neural networks capture during learning. Recent advancements show that diffusion models can generate high-quality images, hinting at their ability to capture this underlying compositional structure. We study this phenomenon in a hierarchical generative model of data. We find that the backward diffusion process acting after a time t is governed by a phase transition at some threshold time, where the probability of reconstructing high-level features, like the class of an image, suddenly drops. Instead, the reconstruction of low-level features, such as specific details of an image, evolves smoothly across the whole diffusion process. This result implies that at times beyond the transition, the class has changed, but the generated sample may still be composed of low-level elements of the initial image. We validate these theoretical insights through numerical experiments on class-unconditional ImageNet diffusion models. Our analysis characterizes the relationship between time and scale in diffusion models and puts forward generative models as powerful tools to model combinatorial data properties.
PMID:39746044 | DOI:10.1073/pnas.2408799121
Exploring happiness factors with explainable ensemble learning in a global pandemic
PLoS One. 2025 Jan 2;20(1):e0313276. doi: 10.1371/journal.pone.0313276. eCollection 2025.
ABSTRACT
Happiness is a state of contentment, joy, and fulfillment, arising from relationships, accomplishments, and inner peace, leading to well-being and positivity. The greatest happiness principle posits that morality is determined by pleasure, aiming for a society where individuals are content and free from suffering. While happiness factors vary, some are universally recognized. The World Happiness Report (WHR), published annually, includes data on 'GDP per capita', 'social support', 'life expectancy', 'freedom to make life choices', 'generosity', and 'perceptions of corruption'. This paper predicts happiness scores using Machine Learning (ML), Deep Learning (DL), and ensemble ML and DL algorithms and examines the impact of individual variables on the happiness index. We also show the impact of the COVID-19 pandemic on the happiness features. We design two ensemble ML and DL models using blending and stacking ensemble techniques, namely, Blending RGMLL, which combines Ridge Regression (RR), Gradient Boosting (GB), Multilayer Perceptron (MLP), Long Short-Term Memory (LSTM), and Linear Regression (LR), and Stacking LRGR, which combines LR, Random Forest (RF), GB, and RR. Among the trained models, Blending RGMLL demonstrates the highest predictive accuracy with R² of 85%, MSE of 0.15, and RMSE of 0.38. We employ Explainable Artificial Intelligence (XAI) techniques to uncover changes in happiness indices, variable importance, and the impact of the COVID-19 pandemic on happiness. The study utilizes an open dataset from the WHR, covering 156 countries from 2018 to 2023. Our findings indicate that 'GDP per capita' is the most critical indicator of happiness score (HS), while 'social support' and 'healthy life expectancy' are also important features before and after the pandemic. However, during the pandemic, 'social support' emerged as the most important indicator, followed by 'healthy life expectancy' and 'GDP per capita', because social support is the prime necessity in the pandemic situation. The outcome of this research helps people understand the impact of these features on increasing the HS and provides guidelines on how happiness can be maintained during unwanted situations. Future research will explore advanced methods and include other related features with real-time monitoring for more comprehensive insights.
PMID:39746025 | DOI:10.1371/journal.pone.0313276
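The three regression metrics quoted in the abstract above (R², MSE, and RMSE) are related by simple formulas. The following Python sketch, using only the standard library, shows how they are conventionally computed from true and predicted scores. The function name and the sample values are hypothetical and are not taken from the study or from the WHR dataset.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (r2, mse, rmse) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    mse = ss_res / n
    return 1.0 - ss_res / ss_tot, mse, math.sqrt(mse)


# Made-up happiness scores and predictions, for illustration only.
scores = [7.5, 6.0, 5.0, 4.5, 3.0]
preds = [7.0, 6.5, 5.0, 4.0, 3.5]
r2, mse, rmse = regression_metrics(scores, preds)
```

Because RMSE is the square root of MSE, the reported figures can be checked against each other: the square root of 0.15 is about 0.387, in line with the reported RMSE of 0.38.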
Predicting learning achievement using ensemble learning with result explanation
PLoS One. 2025 Jan 2;20(1):e0312124. doi: 10.1371/journal.pone.0312124. eCollection 2025.
ABSTRACT
Predicting learning achievement is a crucial strategy to address high dropout rates. However, existing prediction models often exhibit biases, limiting their accuracy. Moreover, the lack of interpretability in current machine learning methods restricts their practical application in education. To overcome these challenges, this research combines the strengths of various machine learning algorithms to design a robust model that performs well across multiple metrics, and uses interpretability analysis to elucidate the prediction results. This study introduces a predictive framework for learning achievement based on ensemble learning techniques. Specifically, six distinct machine learning models are used as base learners, with logistic regression serving as the meta learner to construct an ensemble model for predicting learning achievement. The SHapley Additive exPlanation (SHAP) model is then employed to explain the prediction results. Experiments on the XuetangX dataset verify the effectiveness of the proposed model. The proposed model outperforms traditional machine learning and deep learning models in terms of prediction accuracy. The results demonstrate that the ensemble learning-based predictive framework significantly outperforms traditional machine learning methods. Through feature importance analysis, the SHAP method enhances model interpretability and improves the reliability of the prediction results, enabling more personalized interventions to support students.
PMID:39745993 | DOI:10.1371/journal.pone.0312124
Deep Learning-Based SD-OCT Layer Segmentation Quantifies Outer Retina Changes in Patients With Biallelic RPE65 Mutations Undergoing Gene Therapy
Invest Ophthalmol Vis Sci. 2025 Jan 2;66(1):5. doi: 10.1167/iovs.66.1.5.
ABSTRACT
PURPOSE: To quantify outer retina structural changes and define novel biomarkers of inherited retinal degeneration associated with biallelic mutations in RPE65 (RPE65-IRD) in patients before and after subretinal gene augmentation therapy with voretigene neparvovec (Luxturna).
METHODS: Application of advanced deep learning for automated retinal layer segmentation, specifically tailored for RPE65-IRD. Quantification of five novel biomarkers for the ellipsoid zone (EZ): thickness, granularity, reflectivity, and intensity. Estimation of the EZ area in single and volume scans was performed with optimized segmentation boundaries. The control group was age similar and without significant refractive error. Spherical equivalent refraction and ocular length were evaluated in all patients.
RESULTS: We observed significant differences in the structural analysis of EZ biomarkers in 22 patients with RPE65-IRD compared with 94 healthy controls. Relative EZ intensities were already reduced in pediatric eyes. Reductions of EZ local granularity and EZ thickness were only significant in adult eyes. Distances of the outer plexiform layer, external limiting membrane, and Bruch's membrane to EZ were reduced at all ages. EZ diameter and area were better preserved in pediatric eyes undergoing therapy with voretigene neparvovec and in patients with a milder phenotype.
CONCLUSIONS: Automated quantitative analysis of biomarkers within EZ visualizes distinct structural differences in the outer retina of patients including treatment-related effects. The automated approach using deep learning strategies allows big data analysis for distinct forms of inherited retinal degeneration. Limitations include a small dataset and potential effects on OCT scans from myopia at least -5 diopters, the latter considered nonsignificant for outer retinal layers.
PMID:39745677 | DOI:10.1167/iovs.66.1.5
Optimized smFISH Pipeline for Studying Nascent Transcription in Mouse Embryonic Tissue Samples
Methods Mol Biol. 2025;2889:53-66. doi: 10.1007/978-1-0716-4322-8_5.
ABSTRACT
Understanding the spatial and temporal dynamics of gene expression is crucial for unraveling molecular mechanisms underlying various biological processes. While traditional methods have offered insights into gene expression patterns, they primarily focus on mature mRNA transcripts, lacking real-time visualization of newly synthesized or nascent transcription events. Recent advancements in monitoring nascent transcription in live cells provide valuable insights into transcriptional dynamics. However, such approaches are limited in mammalian embryos. Addressing this gap, we optimized a single molecule fluorescent in situ hybridization (smFISH) technique and coupled it with deep learning algorithms to automate detection of nascent transcription in mouse embryonic tissue samples. Our method enables precise quantification and comparison of nascent transcripts within tissue sections, offering reproducible results and potential applications in studying gene expression dynamics across various developmental stages.
PMID:39745605 | DOI:10.1007/978-1-0716-4322-8_5
Artificial intelligence-based cardiovascular/stroke risk stratification in women affected by autoimmune disorders: a narrative survey
Rheumatol Int. 2025 Jan 2;45(1):14. doi: 10.1007/s00296-024-05756-5.
ABSTRACT
Women are disproportionately affected by chronic autoimmune diseases (AD) like systemic lupus erythematosus (SLE), scleroderma, rheumatoid arthritis (RA), and Sjögren's syndrome. Traditional evaluations often underestimate the associated cardiovascular disease (CVD) and stroke risk in women with AD. Vitamin D deficiency increases susceptibility to these conditions. CVD risk prediction in AD can benefit from surrogate biomarkers for coronary artery disease (CAD), such as carotid ultrasound. Because CVD risk stratification is non-linear, we use an artificial intelligence-based system that combines AD biomarkers and carotid ultrasound. The objectives are, first, to investigate the relationship between AD and CVD/stroke markers, including autoantibody-influenced plaque load; second, to study the surrogate biomarkers for CAD and gather radiomics-based features such as carotid intima-media thickness (cIMT) and plaque area (PA); and third, to explore automated CVD/stroke risk identification using advanced machine learning (ML) and deep learning (DL) paradigms. We analysed biomarker data from women with AD, including carotid ultrasonography imaging, clinical parameters, autoantibody profiles, and vitamin D levels, and proposed artificial intelligence (AI) models to predict CVD/stroke risk accurately in women with AD. There is a strong association between AD duration and elevated cIMT/PA, with increased CVD risk linked to higher rheumatoid factor (RF) and anti-citrullinated peptide antibody (ACPA) levels. AI models outperformed conventional methods by integrating imaging data and disorder-specific factors. Interdisciplinary collaboration is crucial for managing CVD/stroke in women with chronic autoimmune diseases. AI-assisted risk stratification methods may improve treatment decision-making and cardiovascular outcomes.
PMID:39745536 | DOI:10.1007/s00296-024-05756-5
NMRformer: A Transformer-Based Deep Learning Framework for Peak Assignment in 1D 1H NMR Spectroscopy
Anal Chem. 2025 Jan 2. doi: 10.1021/acs.analchem.4c05632. Online ahead of print.
ABSTRACT
Metabolite identification from 1D 1H NMR spectra is a major challenge in NMR-based metabolomics. This study introduces NMRformer, a Transformer-based deep learning framework for accurate peak assignment and metabolite identification in 1D 1H NMR spectroscopy. Unlike traditional approaches, NMRformer interprets spectra as sequences of spectral peaks and integrates a self-attention mechanism and peak height ratios directly into the Transformer encoder layer. It has the capability to recognize and interpret long-range dependencies between peaks and to quickly identify peaks corresponding to identical metabolites. The effectiveness of NMRformer has been rigorously validated by analyzing real 1D 1H NMR spectra from a variety of cellular and biofluid samples. NMRformer achieved peak assignment accuracies above 88% and metabolite identification accuracies above 80% in four types of cellular samples. It also achieved peak assignment accuracies above 88% and metabolite identification accuracies above 80% in three types of biofluid samples. These results underscore the ability of NMRformer to significantly improve the accuracy and efficiency of peak assignment and metabolite identification in NMR-based metabolomics studies.
PMID:39745381 | DOI:10.1021/acs.analchem.4c05632
Deep-learning-enhanced modeling of electrosprayed particle assembly on non-spherical droplet surfaces
Soft Matter. 2025 Jan 2. doi: 10.1039/d4sm01160k. Online ahead of print.
ABSTRACT
Monolayer assembly of charged colloidal particles at liquid interfaces opens a new avenue for advancing the additive manufacturing of thin film materials and devices with tailored properties. In this study, we investigated the dynamics of electrosprayed colloidal particles at curved droplet interfaces through a combination of physics-based computational simulations and machine learning. We employed a novel mesh-constrained Brownian dynamics (BD) algorithm coupled with Ansys® electric field simulations to model the transport and assembly of charged particles on a non-spherical droplet surface. We demonstrated that the electrostatic repulsion between particles, electrophoretic forces induced by substrate surface charge, and Brownian motion are the key factors influencing the compactness and ordering of the assembly structure. We further trained a deep neural network surrogate model using the data generated from the BD simulations to predict radial distribution functions (RDF) of particle assembly. By coupling the surrogate model with Bayesian optimization, we identified the optimal particle and substrate charge densities that yield the best match between the simulation and experimental assembly. Using the optimal charge densities, the RDF profile of the simulated assembly accurately matches the experiment with a similarity of 96.4%, and the corresponding average bond order parameter differs by less than 5% from the experimental one. This deep-learning-based approach significantly reduces computational time while maintaining high accuracy in predicting the important features of the assembly structures. The charge densities inferred from the modeling provide critical insights into the surface charge accumulation in the electrospray process.
PMID:39745220 | DOI:10.1039/d4sm01160k
Automated Cone Beam Computed Tomography Segmentation of Multiple Impacted Teeth With or Without Association to Rare Diseases: Evaluation of Four Deep Learning-Based Methods
Orthod Craniofac Res. 2025 Jan 2. doi: 10.1111/ocr.12890. Online ahead of print.
ABSTRACT
OBJECTIVE: To assess the accuracy of three commercially available and one open-source deep learning (DL) solutions for automatic tooth segmentation in cone beam computed tomography (CBCT) images of patients with multiple dental impactions.
MATERIALS AND METHODS: Twenty patients (20 CBCT scans) were selected from a retrospective cohort of individuals with multiple dental impactions. For each CBCT scan, one reference segmentation and four DL segmentations of the maxillary and mandibular teeth were obtained. Reference segmentations were generated by experts using a semi-automatic process. DL segmentations were automatically generated according to the manufacturer's instructions. Quantitative and qualitative evaluations of each DL segmentation were performed by comparing it with expert-generated segmentation. The quantitative metrics used were Dice similarity coefficient (DSC) and the normalized surface distance (NSD).
RESULTS: The patients had an average of 12 retained teeth, and 12 of the patients were diagnosed with a rare disease. DSC values ranged from 88.5% ± 3.2% to 95.6% ± 1.2%, and NSD values ranged from 95.3% ± 2.7% to 97.4% ± 6.5%. The number of completely unsegmented teeth ranged from 1 (0.1%) to 41 (6.0%). Two solutions (Diagnocat and DentalSegmentator) outperformed the others across all tested parameters.
CONCLUSION: All the tested methods showed a mean NSD of approximately 95%, proving their overall efficiency for tooth segmentation. The accuracy of the methods varied among the four tested solutions owing to the presence of impacted teeth in our CBCT scans. DL solutions are evolving rapidly, and their future performance cannot be predicted based on our results.
PMID:39744906 | DOI:10.1111/ocr.12890
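The Dice similarity coefficient used in the evaluation above measures the overlap between a reference segmentation and a predicted one as twice the size of their intersection divided by the sum of their sizes. The following Python sketch illustrates it on toy voxel sets. The function name, the voxel coordinates, and the convention for two empty masks are assumptions made here, and the normalized surface distance (NSD) is not covered.

```python
def dice_coefficient(reference, prediction):
    """Dice similarity coefficient between two sets of voxel coordinates."""
    reference, prediction = set(reference), set(prediction)
    if not reference and not prediction:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    overlap = len(reference & prediction)
    return 2.0 * overlap / (len(reference) + len(prediction))


# Toy masks as sets of (x, y, z) voxel indices; not real CBCT data.
expert_mask = {(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 0)}
model_mask = {(0, 1, 0), (1, 0, 0), (1, 1, 0), (2, 1, 0)}
dsc = dice_coefficient(expert_mask, model_mask)
```

In this toy case three of the four voxels in each mask coincide, so the score is 2 × 3 / (4 + 4) = 0.75. A tooth that a method fails to segment at all would score 0 against its reference.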
The Biomedical Applications of Artificial Intelligence: An Overview of Decades of Research
J Drug Target. 2025 Jan 2:1-85. doi: 10.1080/1061186X.2024.2448711. Online ahead of print.
ABSTRACT
A significant area of computer science called artificial intelligence (AI) is successfully applied to the analysis of intricate biological data and the extraction of substantial associations from datasets for a variety of biomedical uses. AI has attracted significant interest in biomedical research due to its features: (i) better patient care through early diagnosis and detection; (ii) enhanced workflow; (iii) lowering medical errors; (iv) lowering medical costs; (v) reducing morbidity and mortality; (vi) enhancing performance; (vii) enhancing precision; and (viii) time efficiency. Quantitative metrics are crucial for evaluating AI implementations, providing insights, enabling informed decisions, and measuring the impact of AI-driven initiatives, thereby enhancing transparency, accountability, and overall impact. The implementation of AI in biomedical fields faces challenges such as ethical and privacy concerns, lack of awareness, technology unreliability, and professional liability. A brief discussion is given of the AI techniques, which include Virtual screening (VS), DL, ML, Hidden Markov models (HMMs), Neural networks (NNs), Generative models (GMs), Molecular dynamics (MD), and Structure-activity relationship (SAR) models. The study explores the application of AI in biomedical fields, highlighting its enhanced predictive accuracy, treatment efficacy, diagnostic efficiency, faster decision-making, personalized treatment strategies, and precise medical interventions.
PMID:39744873 | DOI:10.1080/1061186X.2024.2448711
Automatic Segmentation of Vestibular Schwannoma From MRI Using Two Cascaded Deep Learning Networks
Laryngoscope. 2025 Jan 2. doi: 10.1002/lary.31979. Online ahead of print.
ABSTRACT
OBJECTIVE: Automatic segmentation and detection of vestibular schwannoma (VS) in MRI by deep learning is an upcoming topic. However, deep learning faces generalization challenges due to tumor variability even though measurements and segmentation of VS are essential for growth monitoring and treatment planning. Therefore, we introduce a novel model combining two Convolutional Neural Network (CNN) models for the detection of VS by deep learning aiming to improve performance of automatic segmentation.
METHODS: Deep learning techniques have been employed for automatic VS tumor segmentation, including 2D, 2.5D, and 3D UNet-like architectures, which are CNNs specifically designed to improve automatic segmentation for medical imaging. Specifically, we introduce a sequential connection where the first UNet's predicted segmentation map is passed to a second complementary network for refinement. Additionally, spatial attention mechanisms are utilized to further guide refinement in the second network.
RESULTS: We conducted experiments on both public and private datasets containing contrast-enhanced T1 and high-resolution T2-weighted magnetic resonance imaging (MRI). Across the public dataset, we observed consistent improvements in Dice scores for all variants of 2D, 2.5D, and 3D CNN methods, with a notable enhancement of 8.86% for the 2D UNet variant on T1. In our private dataset, a 3.75% improvement was reported for 2D T1. Moreover, we found that T1 images generally outperformed T2 in VS segmentation.
CONCLUSION: We demonstrate that sequential connection of UNets combined with spatial attention mechanisms enhances VS segmentation performance across state-of-the-art 2D, 2.5D, and 3D deep learning methods.
LEVEL OF EVIDENCE: 3 Laryngoscope, 2024.
PMID:39744768 | DOI:10.1002/lary.31979
Utilization of artificial intelligence in the diagnosis of pes planus and pes cavus with a smartphone camera
World J Orthop. 2024 Dec 18;15(12):1146-1154. doi: 10.5312/wjo.v15.i12.1146. eCollection 2024 Dec 18.
ABSTRACT
BACKGROUND: Pes planus (flatfoot) and pes cavus (high arch foot) are common foot deformities, often requiring clinical and radiographic assessment for diagnosis and potential subsequent management. Traditional diagnostic methods, while effective, pose limitations such as cost, radiation exposure, and accessibility, particularly in underserved areas.
AIM: To develop deep learning algorithms that detect and classify such deformities using smartphone cameras.
METHODS: An algorithm that integrated a deep convolutional neural network (CNN) into a smartphone camera was utilized to detect pes planus and pes cavus deformities. This case control study was conducted at a tertiary hospital with participants recruited from two orthopaedic foot and ankle clinics. The CNN was trained and tested using photographs of the medial aspect of participants' feet, taken under standardized conditions. Participants included subjects with standard foot alignment, pes planus, or pes cavus determined by an expert clinician using the foot posture index. The model's performance was assessed in comparison to clinical assessment and radiographic measurements, specifically lateral tarsal-first metatarsal angle and calcaneal inclination angle.
RESULTS: The CNN model demonstrated high accuracy in diagnosing both pes planus and pes cavus, with an optimized area under the curve of 0.90 for pes planus and 0.90 for pes cavus. It showed a specificity and sensitivity of 84% and 87% for pes planus detection, respectively; and 97% and 70% for pes cavus, respectively. The model's prediction correlated moderately with radiographic lateral Meary's angle measurements, indicating the model's excellent reliability in assessing foot arch deformity (P < 0.05).
CONCLUSION: This study highlights the potential of using a smartphone-based CNN model as a screening tool that is reliable and accessible for the detection of pes planus and pes cavus deformities, which is especially beneficial for underserved communities and patients with pain generated by subtle foot arch deformities.
PMID:39744730 | PMC:PMC11686530 | DOI:10.5312/wjo.v15.i12.1146
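As a reminder of how the sensitivity and specificity figures above are defined, here is a short sketch that derives both from confusion-matrix counts. The counts below are hypothetical, chosen only so that the output matches the percentages reported for pes planus; they are not the study's data, and the function name is an assumption.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): share of true cases the model detects.
    specificity = TN / (TN + FP): share of non-cases the model clears.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts (100 positives, 100 negatives), not taken from the paper.
sens, spec = sensitivity_specificity(tp=87, fn=13, tn=84, fp=16)
```

With these illustrative counts, `sens` is 0.87 and `spec` is 0.84.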
Alleviating the medical strain: a triage method via cross-domain text classification
Front Comput Neurosci. 2024 Dec 18;18:1468519. doi: 10.3389/fncom.2024.1468519. eCollection 2024.
ABSTRACT
Patients in large general hospitals commonly do not know which clinical department to register with. Although triage nurses can help, the large number of patients means that patients have to queue for minutes before they can consult one. Recently, there have been some efforts to apply deep-learning techniques or pre-trained language models (PLMs) to triage recommendations. However, these methods may suffer from two main limitations: (1) they typically require a certain amount of labeled or unlabeled data for model training, which is not always accessible and is costly to acquire; (2) they do not take into account the distortion of semantic feature structure and the loss of category discriminability during model training. To overcome these limitations, we propose a cross-domain text classification method based on prompt-tuning, which classifies patients' questions or descriptions of their symptoms into several given categories to suggest which kind of consulting room they could choose. Specifically, different prompt templates are first manually crafted based on various data contents, embedding source-domain information into the prompt templates to generate another text with a similar semantic feature structure for the classification task. Then, five different strategies are employed to expand the label word space for modifying prompts, and the integration of these strategies is used as the final verbalizer. Extensive experiments on Chinese Triage datasets demonstrate that our method achieves state-of-the-art performance.
PMID:39744724 | PMC:PMC11688176 | DOI:10.3389/fncom.2024.1468519
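The role of a verbalizer in prompt-tuning can be illustrated with a small sketch. It assumes that a masked language model has already produced probabilities for candidate words at the prompt's mask position, and that each department is scored by the mean probability of its label words. The department names, label words, probabilities, and the mean-aggregation rule are all hypothetical; the abstract does not specify how the five expansion strategies are integrated.

```python
def verbalize(word_probs, verbalizer):
    """Map mask-position word probabilities to a class label.

    word_probs: dict of word -> probability (assumed to come from a PLM).
    verbalizer: dict of class label -> list of label words.
    Each class is scored by the mean probability of its label words,
    which is an aggregation rule assumed here for illustration.
    """
    scores = {}
    for label, words in verbalizer.items():
        probs = [word_probs.get(word, 0.0) for word in words]
        scores[label] = sum(probs) / len(probs)
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical expanded label word space for two departments.
verbalizer = {
    "cardiology": ["heart", "chest", "palpitation"],
    "dermatology": ["skin", "rash", "itch"],
}
# Hypothetical PLM output for a prompt such as "<symptom text> Department: [MASK]".
word_probs = {"heart": 0.30, "chest": 0.24, "rash": 0.06, "skin": 0.03}

best_label, class_scores = verbalize(word_probs, verbalizer)
```

Expanding the label word space means that a class can still score well when the model favors a synonym rather than the class name itself.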
Multimodal sleep staging network based on obstructive sleep apnea
Front Comput Neurosci. 2024 Dec 18;18:1505746. doi: 10.3389/fncom.2024.1505746. eCollection 2024.
ABSTRACT
BACKGROUND: Automatic sleep staging is essential for assessing sleep quality and diagnosing sleep disorders. While previous research has achieved high classification performance, most current sleep staging networks have only been validated in healthy populations, ignoring the impact of Obstructive Sleep Apnea (OSA) on sleep stage classification. In addition, it remains challenging to effectively improve the fine-grained detection of polysomnography (PSG) and capture multi-scale transitions between sleep stages. Therefore, a more widely applicable network is needed for sleep staging.
METHODS: This paper introduces MSDC-SSNet, a novel deep learning network for automatic sleep stage classification. MSDC-SSNet transforms two channels of electroencephalogram (EEG) and one channel of electrooculogram (EOG) signals into time-frequency representations to obtain feature sequences at different temporal and frequency scales. An improved Transformer encoder architecture ensures temporal consistency and effectively captures long-term dependencies in EEG and EOG signals. The Multi-Scale Feature Extraction Module (MFEM) employs convolutional layers with varying dilation rates to capture spatial patterns from fine to coarse granularity. It adaptively fuses the weights of features to enhance the robustness of the model. Finally, multiple channel data are integrated to address the heterogeneity between different modalities effectively and alleviate the impact of OSA on sleep stages.
RESULTS: We evaluated MSDC-SSNet on three public datasets and our collection of PSG records of 17 OSA patients. It achieved an accuracy of 80.4% on the OSA dataset. It also outperformed the state-of-the-art methods in terms of accuracy, F1 score, and Cohen's Kappa coefficient on the remaining three datasets.
CONCLUSION: The MSDC-SSNet multi-channel sleep staging architecture proposed in this study enhances widespread system applicability by supplementing inter-channel features. It employs multi-scale attention to extract transition rules between sleep stages and effectively integrates multimodal information. Our method addresses the limitations of single-channel approaches, enhancing interpretability for clinical applications.
PMID:39744723 | PMC:PMC11688327 | DOI:10.3389/fncom.2024.1505746
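Cohen's Kappa coefficient, one of the metrics reported above, corrects raw accuracy for the agreement expected by chance. The following is a minimal sketch of the standard formula applied to per-epoch sleep stage labels. The function name and the eight example epochs are illustrative assumptions, not data from the study.

```python
from collections import Counter


def cohens_kappa(y_true, y_pred):
    """Cohen's kappa = (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement (plain accuracy); p_e is the agreement
    expected by chance given each rater's label frequencies.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(y_true)
    observed = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        # Only one label is used by both raters; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical expert and model labels for eight 30-second epochs.
expert = ["W", "N1", "N2", "N2", "N3", "REM", "N2", "W"]
model = ["W", "N2", "N2", "N2", "N3", "REM", "N1", "W"]
kappa = cohens_kappa(expert, model)
```

Here the observed agreement is 0.75 and the chance agreement is 0.25, so kappa is about 0.67, noticeably lower than the raw accuracy.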