Deep learning
AI in evaluating ambulation of stroke patients: severity classification with video and functional ambulation category scale
Top Stroke Rehabil. 2024 Jun 6:1-9. doi: 10.1080/10749357.2024.2359342. Online ahead of print.
ABSTRACT
BACKGROUND: The evaluation of gait function and severity classification of stroke patients are important for determining rehabilitation goals and exercise levels. Physicians often qualitatively evaluate patients' walking ability through visual gait analysis using the naked eye, video images, or standardized assessment tools. Gait evaluation through observation relies on the physician's empirical judgment and can therefore introduce subjectivity. Conducting research to establish a basis for more objective judgment is thus crucial.
OBJECTIVE: To verify a deep learning model that classifies gait image data of stroke patients according to the Functional Ambulation Category (FAC) scale.
METHODS: Gait vision data from 203 stroke patients and 182 healthy individuals recruited from six medical institutions were collected to train a deep learning model for classifying gait severity in stroke patients. The recorded videos were processed using OpenPose. The dataset was randomly split into 80% for training and 20% for testing.
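As a hedged illustration of the data preparation described above (not the authors' code), the sketch below performs the 80/20 split on stand-in OpenPose keypoint sequences; the shapes, the BODY_25 format choice, and the label encoding are all assumptions.

```python
# Minimal sketch of the 80/20 split described in METHODS. Shapes are
# illustrative: 385 subjects (203 stroke + 182 healthy), T frames of
# OpenPose BODY_25 keypoints flattened to 50 (x, y) values per frame.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(385, 120, 50))      # stand-in keypoint sequences
y = rng.integers(0, 3, size=385)         # stand-in FAC-derived severity

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print(X_train.shape, X_test.shape)       # (308, 120, 50) (77, 120, 50)
```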
RESULTS: The deep learning model attained a training accuracy of 0.981 and a test accuracy of 0.903, with Area Under the Curve (AUC) values of 0.93, 0.95, and 0.96 for discriminating the mild, moderate, and severe stroke groups, respectively.
CONCLUSION: This confirms the potential of vision-based human posture estimation not only for developing gait parameter models but also for developing models that classify severity according to the FAC criteria used by physicians. Developing an AI-based severity classification model requires a large amount and variety of data, and data collected in non-standardized real-world environments, rather than laboratories, can also be used meaningfully.
PMID:38841903 | DOI:10.1080/10749357.2024.2359342
Utilization of Artificial Intelligence in Minimally Invasive Right Adrenalectomy: Recognition of Anatomical Landmarks with Deep Learning
Acta Chir Belg. 2024 Jun 6:1-8. doi: 10.1080/00015458.2024.2363599. Online ahead of print.
ABSTRACT
BACKGROUND: The primary surgical approach for removing adrenal masses is minimally invasive adrenalectomy. Recognition of anatomical landmarks during surgery is critical for minimizing complications. Artificial intelligence-based tools can be utilized to create real-time navigation systems during laparoscopic and robotic right adrenalectomy. In this study, we aimed to develop deep learning models that can identify critical anatomical structures during minimally invasive right adrenalectomy.
METHODS: In this experimental feasibility study, intraoperative videos of 20 patients who underwent minimally invasive right adrenalectomy in a tertiary care center between 2011 and 2023 were analyzed and used to develop an artificial intelligence-based anatomical landmark recognition system. Semantic segmentation of the liver, the inferior vena cava (IVC), and the right adrenal gland was performed. Fifty random images per patient were extracted from the dissection phase of the videos. Experiments on the annotated images were performed with two state-of-the-art segmentation models, SwinUNETR and MedNeXt, which are transformer- and convolutional neural network (CNN)-based architectures, respectively. Two loss function combinations, Dice-Cross Entropy and Dice-Focal Loss, were tested for both models. The dataset was split into training and validation subsets with an 80:20 distribution on a patient basis in a 5-fold cross-validation approach. To introduce sample variability, strong augmentation with intensity modifications and perspective transformations was applied to represent different surgical environments. The models were evaluated with the Dice Similarity Coefficient (DSC) and Intersection over Union (IoU), two widely used segmentation metrics. For pixel-wise classification performance, Accuracy, Sensitivity, and Specificity were calculated on the validation subset.
RESULTS: From the 20 videos, 1000 images were extracted and the anatomical landmarks (liver, IVC, and right adrenal gland) were annotated. Randomly distributed sets of 800 and 200 images were selected for the training and validation subsets, respectively. Our benchmark results show that Dice-Cross Entropy Loss with the transformer-based SwinUNETR model achieved a 78.37% mDSC score, whereas the CNN-based MedNeXt model reached 77.09%. Conversely, MedNeXt reached a higher mIoU score (63.71%) than SwinUNETR (62.10%) on the three-region prediction task.
CONCLUSION: Artificial intelligence-based systems can predict anatomical landmarks with high performance in minimally invasive right adrenalectomy. Such tools could later be used to create real-time navigation systems during surgery.
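For reference, a minimal sketch of the two reported segmentation metrics, DSC and IoU, computed on binary masks; this is a generic implementation, not taken from the study's codebase.

```python
# DSC and IoU between binary prediction and ground-truth masks.
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps=1e-7) -> float:
    """Dice Similarity Coefficient between two binary masks."""
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def iou(pred: np.ndarray, target: np.ndarray, eps=1e-7) -> float:
    """Intersection over Union between two binary masks."""
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (intersection + eps) / (union + eps)

# Evaluate per class (liver, IVC, adrenal gland) and average the scores
# to obtain mDSC / mIoU as reported in the abstract.
pred = np.zeros((64, 64), bool); pred[10:40, 10:40] = True
target = np.zeros((64, 64), bool); target[15:45, 15:45] = True
print(dice_coefficient(pred, target), iou(pred, target))
```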
PMID:38841838 | DOI:10.1080/00015458.2024.2363599
Assistive tools for classifying neurological disorders using fMRI and deep learning: A guide and example
Brain Behav. 2024 Jun;14(6):e3554. doi: 10.1002/brb3.3554.
ABSTRACT
BACKGROUND: Deep-learning (DL) methods are rapidly changing the way researchers classify neurological disorders. For example, combining functional magnetic resonance imaging (fMRI) and DL has helped researchers identify functional biomarkers of neurological disorders (e.g., brain activation and connectivity) and pilot innovative diagnostic models. However, the knowledge required to perform DL analyses is often domain-specific and is not widely taught in the brain sciences (e.g., psychology, neuroscience, and cognitive science). Conversely, neurological diagnoses and neuroimaging training (e.g., fMRI) are largely restricted to the brain and medical sciences. In turn, these disciplinary knowledge barriers and distinct specializations can act as hurdles that prevent the combination of fMRI and DL pipelines. The complexity of fMRI and DL methods also hinders their clinical adoption and generalization to real-world diagnoses. For example, most current models are not designed for clinical settings or use by nonspecialized populations such as students, clinicians, and healthcare workers. Accordingly, there is a growing area of assistive tools (e.g., software and programming packages) that aim to streamline and increase the accessibility of fMRI and DL pipelines for the diagnoses of neurological disorders.
OBJECTIVES AND METHODS: In this study, we present an introductory guide to some popular DL and fMRI assistive tools. We also create an example autism spectrum disorder (ASD) classification model using assistive tools (e.g., Optuna, GIFT, and the ABIDE preprocessed repository), fMRI, and a convolutional neural network.
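As a hedged example of one of the named assistive tools, the sketch below uses Optuna to tune a single hyperparameter; the stand-in logistic regression and synthetic data are assumptions, since the study's actual CNN and search space are not reproduced here.

```python
# Optuna hyperparameter search over a simple classifier (stand-in for
# the study's CNN); the search space here is illustrative only.
import optuna
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=30, random_state=0)

def objective(trial: optuna.Trial) -> float:
    c = trial.suggest_float("C", 1e-3, 1e2, log=True)  # regularization
    clf = LogisticRegression(C=c, max_iter=1000)
    return cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=30)
print(study.best_params, study.best_value)
```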
RESULTS: In turn, we provide researchers with a guide to assistive tools and give an example of a streamlined fMRI and DL pipeline.
CONCLUSIONS: We are confident that this study can help more researchers enter the field and create accessible fMRI and deep-learning diagnostic models for neurological disorders.
PMID:38841732 | DOI:10.1002/brb3.3554
Multi-risk factors joint prediction model for risk prediction of retinopathy of prematurity
EPMA J. 2024 May 9;15(2):261-274. doi: 10.1007/s13167-024-00363-7. eCollection 2024 Jun.
ABSTRACT
PURPOSE: Retinopathy of prematurity (ROP) is a retinal vascular proliferative disease common in low-birth-weight and premature infants and is one of the main causes of blindness in children. In the context of predictive, preventive and personalized medicine (PPPM/3PM), early screening, identification, and treatment of ROP will directly contribute to improving patients' long-term visual prognosis and reducing the risk of blindness. Our objective was therefore to combine an artificial intelligence (AI) algorithm with clinical demographics to create a risk model for ROP, including treatment-requiring retinopathy of prematurity (TR-ROP).
METHODS: Data from a total of 22,569 infants who underwent routine ROP screening at Shenzhen Eye Hospital from March 2003 to September 2023 were collected, including 3335 infants with ROP, 1234 of whom had TR-ROP. Two machine learning methods (logistic regression and decision tree) and a deep learning method (multi-layer perceptron) were trained on combinations of risk factors such as birth weight (BW), gestational age (GA), gender, multiple birth (MB), and mode of delivery (MD) to predict the risk of ROP and TR-ROP. We used five evaluation metrics to assess model performance, with the area under the receiver operating characteristic curve (AUC) and the area under the precision-recall curve (AUCPR) as the main metrics.
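A minimal sketch of this kind of risk model, assuming synthetic BW/GA data and a logistic regression stand-in; it reports the two headline metrics, AUC and AUCPR, via scikit-learn.

```python
# Logistic regression on BW + GA with AUC and AUCPR; data are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
bw = rng.normal(1600, 450, 2000)            # birth weight in grams
ga = rng.normal(31.0, 2.5, 2000)            # gestational age in weeks
risk = 1 / (1 + np.exp(0.004 * (bw - 1250) + 0.8 * (ga - 29)))
y = (rng.random(2000) < risk).astype(int)   # synthetic ROP labels

X = np.column_stack([bw, ga])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          random_state=0, stratify=y)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
probs = model.predict_proba(X_te)[:, 1]
print("AUC:", roc_auc_score(y_te, probs))
print("AUCPR:", average_precision_score(y_te, probs))
```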
RESULTS: In risk prediction for ROP, the BW + GA combination demonstrated the best performance (mean ± SD, AUCPR: 0.4849 ± 0.0175; AUC: 0.8124 ± 0.0033). In risk prediction for TR-ROP, reasonable performance was achieved with GA + BW + Gender + MD + MB (AUCPR: 0.2713 ± 0.0214; AUC: 0.8328 ± 0.0088).
CONCLUSIONS: Combining risk factors with AI in ROP screening programs could achieve risk prediction of ROP and TR-ROP, detect TR-ROP earlier, and reduce the number of ROP examinations and unnecessary physiological stress in low-risk infants. Combining ROP-related biometric information with AI is therefore a cost-effective strategy for predictive diagnostics, targeted prevention, and personalized medical services in the early screening and treatment of ROP.
PMID:38841619 | PMC:PMC11147992 | DOI:10.1007/s13167-024-00363-7
Deep learning based retinal vessel segmentation and hypertensive retinopathy quantification using heterogeneous features cross-attention neural network
Front Med (Lausanne). 2024 May 22;11:1377479. doi: 10.3389/fmed.2024.1377479. eCollection 2024.
ABSTRACT
Retinal vessels play a pivotal role as biomarkers in the detection of retinal diseases, including hypertensive retinopathy. The manual identification of these vessels is both resource-intensive and time-consuming, and the fidelity of automated vessel segmentation depends directly on the quality of the fundus images. When image quality is sub-optimal, deep learning-based methods are a more effective approach to precise segmentation. We propose a heterogeneous neural network that combines the local semantic information extraction of a convolutional neural network with the long-range spatial feature mining of a transformer. This cross-attention structure boosts the model's ability to handle vessel structures in retinal images. Experiments on four publicly available datasets demonstrate the model's superior vessel segmentation performance and its strong potential for hypertensive retinopathy quantification.
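A conceptual sketch of the cross-attention fusion idea (CNN-branch features attending to transformer-branch features) is given below; the layer sizes and fusion details are illustrative assumptions, not the paper's architecture.

```python
# Cross-attention between a CNN feature map and transformer tokens.
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, cnn_feats, trans_feats):
        # cnn_feats: (B, N1, C) local features from the CNN branch
        # trans_feats: (B, N2, C) long-range features from the transformer
        fused, _ = self.attn(query=cnn_feats, key=trans_feats,
                             value=trans_feats)
        return self.norm(cnn_feats + fused)  # residual fusion

B, C = 2, 256
cnn = torch.randn(B, 64 * 64, C)     # flattened H*W feature map
trs = torch.randn(B, 196, C)         # transformer token sequence
out = CrossAttentionFusion(C)(cnn, trs)
print(out.shape)                     # torch.Size([2, 4096, 256])
```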
PMID:38841586 | PMC:PMC11150614 | DOI:10.3389/fmed.2024.1377479
Artificial intelligence in drug repurposing for rare diseases: a mini-review
Front Med (Lausanne). 2024 May 22;11:1404338. doi: 10.3389/fmed.2024.1404338. eCollection 2024.
ABSTRACT
Drug repurposing, the process of identifying new uses for existing drugs beyond their original indications, offers significant advantages in reduced development time and cost, particularly in addressing unmet medical needs in rare diseases. Artificial intelligence (AI) has emerged as a transformative force in healthcare, and by leveraging AI technologies, researchers aim to overcome some of the challenges associated with rare diseases. This review presents concrete case studies, as well as pre-existing platforms, initiatives, and companies, that demonstrate the application of AI to drug repurposing in rare diseases. Despite representing a modest share of the literature compared to diseases such as COVID-19 or cancer, the growing interest and investment in AI for drug repurposing in rare diseases underscore its potential to accelerate treatment availability for patients with unmet medical needs.
PMID:38841574 | PMC:PMC11150798 | DOI:10.3389/fmed.2024.1404338
Application of deep learning classification model for regional evaluation of roof pressure support evolution effects over time in coal mining face
Heliyon. 2024 May 23;10(11):e31824. doi: 10.1016/j.heliyon.2024.e31824. eCollection 2024 Jun 15.
ABSTRACT
Hydraulic support leg pressure is a crucial indicator for assessing working face support quality. Current evaluation methods concentrate primarily on static analyses, such as inadequate initial support force, pressure overrun, and uneven bracket force, while neglecting dynamic column pressure changes. This paper introduces a model for assessing hydraulic support quality using deep learning techniques. Real-time data are preprocessed into spatio-temporal pressure sub-matrix samples and input to the model, which identifies the support quality type and characterizes its dynamic evolution within the area, helping operators make targeted adjustments to hydraulic support status. Experimental results showed that an optimized LeNet-5 network, with adjustments to parameters such as convolutional layer count, kernel size, and the ReLU activation function, achieved the highest support-quality classification accuracy of 85.25%, surpassing the other networks, and also outperformed them in F1 score and recall. Additionally, the improved LeNet-5 network converged to the optimal solution faster, accelerating training. These results highlight its advantages in evaluating the spatio-temporal support quality of hydraulic supports in smart mining operations.
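A hedged sketch of a LeNet-5-style classifier with ReLU activations over a spatio-temporal pressure sub-matrix appears below; channel counts, kernel sizes, input shape, and the number of support-quality classes are assumptions, since the paper's exact configuration is not reproduced here.

```python
# LeNet-5-style CNN over a pressure sub-matrix sample.
import torch
import torch.nn as nn

class PressureLeNet(nn.Module):
    def __init__(self, n_classes=4, in_ch=1):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_ch, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 8 * 8, 120), nn.ReLU(),
            nn.Linear(120, n_classes),   # support-quality categories
        )

    def forward(self, x):                # x: (B, 1, 32, 32) pressure matrix
        return self.classifier(self.features(x))

logits = PressureLeNet()(torch.randn(2, 1, 32, 32))
print(logits.shape)                      # torch.Size([2, 4])
```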
PMID:38841511 | PMC:PMC11152685 | DOI:10.1016/j.heliyon.2024.e31824
A deep learning approach based on graphs to detect plantation lines
Heliyon. 2024 May 23;10(11):e31730. doi: 10.1016/j.heliyon.2024.e31730. eCollection 2024 Jun 15.
ABSTRACT
Identifying plantation lines in aerial images of agricultural landscapes is required for many automatic farming processes. Deep learning-based networks are among the most prominent methods for learning such patterns and extracting this information from diverse imagery conditions. However, even state-of-the-art methods may stumble over complex plantation patterns. Here, we propose a graph-based deep learning approach to detect plantation lines in UAV-based RGB imagery in a challenging scenario containing spaced plants. The first module of our method extracts a feature map with a backbone consisting of the initial layers of VGG16. This feature map is input to the Knowledge Estimation Module (KEM), organized in three concatenated branches that detect 1) the plant positions, 2) the plantation lines, and 3) the displacement vectors between plants. Graph modeling is then applied, treating each plant position in the image as a vertex and forming edges between pairs of vertices (i.e., plants). Finally, an edge is classified as belonging to a given plantation line when three probabilities each exceed 0.5: i) one based on visual features from the backbone; ii) the chance that the edge pixels belong to a line, from the KEM step; and iii) the alignment of the displacement vectors with the edge, also from the KEM step. Experiments were first conducted on corn plantations with different growth stages and patterns in aerial RGB imagery to demonstrate the advantages of each module. We then assessed generalization on two additional crop datasets (orange and eucalyptus). The proposed method was compared against state-of-the-art deep learning methods and achieved superior performance by a significant margin on all three datasets. The approach is useful for extracting lines with spaced plantation patterns and could be deployed where plantation gaps occur, generating lines with few to no interruptions.
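A toy sketch of the edge-classification rule described above: an edge joins a plantation line only when all three probabilities exceed 0.5. The probability values and graph library choice are placeholders for the backbone/KEM outputs.

```python
# Graph of plant positions; edges kept only when all three scores > 0.5.
import networkx as nx

G = nx.Graph()
G.add_nodes_from([(0, {"xy": (10, 12)}), (1, {"xy": (34, 15)}),
                  (2, {"xy": (60, 80)})])

# (u, v, p_visual, p_line, p_alignment) from backbone and KEM branches
candidate_edges = [(0, 1, 0.91, 0.88, 0.95),
                   (1, 2, 0.72, 0.31, 0.64)]

for u, v, p_vis, p_line, p_align in candidate_edges:
    if min(p_vis, p_line, p_align) > 0.5:
        G.add_edge(u, v)

print(list(G.edges()))  # [(0, 1)] -- edge (1, 2) rejected (p_line <= 0.5)
```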
PMID:38841473 | PMC:PMC11152659 | DOI:10.1016/j.heliyon.2024.e31730
Distribution-informed and wavelength-flexible data-driven photoacoustic oximetry
J Biomed Opt. 2024 Jun;29(Suppl 3):S33303. doi: 10.1117/1.JBO.29.S3.S33303. Epub 2024 Jun 5.
ABSTRACT
SIGNIFICANCE: Photoacoustic imaging (PAI) promises to measure spatially resolved blood oxygen saturation but suffers from a lack of accurate and robust spectral unmixing methods to deliver on this promise. Accurate blood oxygenation estimation could have important clinical applications from cancer detection to quantifying inflammation.
AIM: We address the inflexibility of existing data-driven methods for estimating blood oxygenation in PAI by introducing a recurrent neural network architecture.
APPROACH: We created 25 simulated training dataset variations to assess neural network performance. We used a long short-term memory network to implement a wavelength-flexible network architecture and proposed the Jensen-Shannon divergence to predict the most suitable training dataset.
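A small sketch of the dataset-selection idea, assuming histogrammed spectra: the Jensen-Shannon metric compares a target distribution with each simulated training set, and the smallest value flags the best match. SciPy's function returns the JS distance (the square root of the divergence).

```python
# Rank simulated training sets by Jensen-Shannon distance to a target.
import numpy as np
from scipy.spatial.distance import jensenshannon

rng = np.random.default_rng(1)
target = np.histogram(rng.normal(0.7, 0.1, 5000), bins=50,
                      range=(0, 1), density=True)[0]

training_sets = {
    f"sim_{i}": np.histogram(rng.normal(m, 0.1, 5000), bins=50,
                             range=(0, 1), density=True)[0]
    for i, m in enumerate([0.5, 0.65, 0.72])
}

scores = {name: jensenshannon(target, hist)
          for name, hist in training_sets.items()}
print(min(scores, key=scores.get))  # sim_2: closest to the target
```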
RESULTS: The network architecture can flexibly handle the input wavelengths and outperforms linear unmixing and the previously proposed learned spectral decoloring method. Small changes in the training data significantly affect the accuracy of our method, but we find that the Jensen-Shannon divergence correlates with the estimation error and is thus suitable for predicting the most appropriate training datasets for any given application.
CONCLUSIONS: A flexible data-driven network architecture combined with the Jensen-Shannon divergence to predict the best training data set provides a promising direction that might enable robust data-driven photoacoustic oximetry for clinical use cases.
PMID:38841431 | PMC:PMC11151660 | DOI:10.1117/1.JBO.29.S3.S33303
Multi-sequence generative adversarial network: better generation for enhanced magnetic resonance imaging images
Front Comput Neurosci. 2024 May 22;18:1365238. doi: 10.3389/fncom.2024.1365238. eCollection 2024.
ABSTRACT
INTRODUCTION: MRI is one of the most commonly used diagnostic methods in clinical practice, especially for brain diseases. MRI comprises many sequences, but T1CE images can only be obtained with contrast agents. Many patients (such as cancer patients) must undergo alignment of multiple MRI sequences for diagnosis, especially the contrast-enhanced sequence. However, some patients, such as pregnant women and children, find it difficult to use contrast agents to obtain enhanced sequences, and contrast agents have many adverse reactions that can pose significant risk. With the continuous development of deep learning, the emergence of generative adversarial networks has made it possible to extract features from one type of image to generate another.
METHODS: We propose a generative adversarial network model with multimodal inputs and end-to-end decoding based on the pix2pix model. We used four evaluation metrics (NMSE, RMSE, SSIM, and PSNR) to assess the effectiveness of the generative models.
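For illustration, the four metrics can be computed as sketched below on a synthetic image pair; in the study they would compare generated and real T1CE images. The NMSE/RMSE formulas used here are the standard definitions and an assumption about the authors' exact implementation.

```python
# NMSE, RMSE, SSIM, and PSNR on a synthetic image pair via scikit-image.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
real = rng.random((256, 256)).astype(np.float64)
fake = np.clip(real + rng.normal(0, 0.05, real.shape), 0, 1)

rmse = np.sqrt(np.mean((real - fake) ** 2))
nmse = np.sum((real - fake) ** 2) / np.sum(real ** 2)
print("NMSE:", nmse, "RMSE:", rmse)
print("SSIM:", structural_similarity(real, fake, data_range=1.0))
print("PSNR:", peak_signal_noise_ratio(real, fake, data_range=1.0))
```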
RESULTS: Statistical analysis comparing our proposed model with pix2pix found significant differences between the two. Our model outperformed pix2pix, with higher SSIM and PSNR and lower NMSE and RMSE. We also found that using T1W and T2W images as inputs worked better than other combinations, providing new ideas for subsequent work on generating contrast-enhanced magnetic resonance sequence images. With our model, it is possible to generate contrast-enhanced magnetic resonance images from non-enhanced sequences.
DISCUSSION: This has significant implications, as it can greatly reduce the use of contrast agents and protect populations, such as pregnant women and children, for whom they are contraindicated. Additionally, contrast agents are relatively expensive, so this generation method may also bring substantial economic benefits.
PMID:38841427 | PMC:PMC11151883 | DOI:10.3389/fncom.2024.1365238
AMP-RNNpro: a two-stage approach for identification of antimicrobials using probabilistic features
Sci Rep. 2024 Jun 5;14(1):12892. doi: 10.1038/s41598-024-63461-6.
ABSTRACT
Antimicrobials are molecules that prevent the growth of microorganisms such as bacteria, viruses, fungi, and parasites. The need to detect antimicrobial peptides (AMPs) with machine learning and deep learning arises from the drive to accelerate AMP discovery and to develop effective antimicrobial therapies, especially in the face of increasing antibiotic resistance. This study introduces AMP-RNNpro, an innovative Recurrent Neural Network (RNN)-based model for detecting AMPs, designed around eight feature encoding methods selected according to four criteria (amino acid composition, grouped amino acid composition, autocorrelation, and pseudo-amino acid composition) to represent protein sequences for efficient AMP identification. Our framework makes predictions in two stages. First, we analyzed 33 models on these feature encodings and selected the best six using rigorous performance metrics. In the second stage, probabilistic features generated by the six selected models on each feature encoding are aggregated and fed into our final meta-model, AMP-RNNpro. We also identified 20 influential features with SHAP, which are crucial for drug development, finding that AAC, ASDC, and CKSAAGP features are highly impactful for detection and drug discovery. Our proposed framework, AMP-RNNpro, excels in the identification of novel AMPs with 97.15% accuracy, 96.48% sensitivity, and 97.87% specificity. We built a user-friendly website demonstrating accurate AMP prediction based on the proposed approach, accessible at http://13.126.159.30/ .
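A conceptual sketch of the two-stage design, assuming scikit-learn stand-ins: out-of-fold probabilities from base models become the feature matrix for a meta-model. The paper's actual base models and RNN meta-model are not reproduced here.

```python
# Two-stage stacking: base-model probabilities feed a meta-model.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split

X, y = make_classification(n_samples=1000, n_features=40, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          random_state=0)

base_models = [RandomForestClassifier(random_state=0),
               LogisticRegression(max_iter=1000)]

# Stage 1: out-of-fold probabilities become the new feature matrix
meta_tr = np.column_stack([
    cross_val_predict(m, X_tr, y_tr, cv=5, method="predict_proba")[:, 1]
    for m in base_models])
meta_te = np.column_stack([
    m.fit(X_tr, y_tr).predict_proba(X_te)[:, 1] for m in base_models])

# Stage 2: meta-model trained on the probabilistic features
meta = LogisticRegression().fit(meta_tr, y_tr)
print("meta-model accuracy:", meta.score(meta_te, y_te))
```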
PMID:38839785 | DOI:10.1038/s41598-024-63461-6
Artificial intelligence applied to laparoscopic cholecystectomy: what is the next step? A narrative review
Updates Surg. 2024 Jun 5. doi: 10.1007/s13304-024-01892-6. Online ahead of print.
ABSTRACT
Artificial Intelligence (AI) is playing an increasing role in several fields of medicine. AI is also used during laparoscopic cholecystectomy (LC) surgeries. In the literature, there is no review that groups together the various fields of application of AI applied to LC. The aim of this review is to describe the use of AI in these contexts. We performed a narrative literature review by searching PubMed, Web of Science, Scopus and Embase for all studies on AI applied to LC, published from January 01, 2010, to December 30, 2023. Our focus was on randomized controlled trials (RCTs), meta-analysis, systematic reviews, and observational studies, dealing with large cohorts of patients. We then gathered further relevant studies from the reference list of the selected publications. Based on the studies reviewed, it emerges that AI could strongly improve surgical efficiency and accuracy during LC. Future prospects include speeding up, implementing, and improving the automaticity with which AI recognizes, differentiates and classifies the phases of the surgical intervention and the anatomic structures that are safe and those at risk.
PMID:38839723 | DOI:10.1007/s13304-024-01892-6
Enhancing Skin Cancer Diagnosis Using Swin Transformer with Hybrid Shifted Window-Based Multi-head Self-attention and SwiGLU-Based MLP
J Imaging Inform Med. 2024 Jun 5. doi: 10.1007/s10278-024-01140-8. Online ahead of print.
ABSTRACT
Skin cancer is one of the most frequently occurring cancers worldwide, and early detection is crucial for effective treatment. Dermatologists often face challenges such as heavy data demands, potential human errors, and strict time limits, which can negatively affect diagnostic outcomes. Deep learning-based diagnostic systems offer quick, accurate testing and enhanced research capabilities, providing significant support to dermatologists. In this study, we enhanced the Swin Transformer architecture by implementing hybrid shifted window-based multi-head self-attention (HSW-MSA) in place of the conventional shifted window-based multi-head self-attention (SW-MSA). This adjustment enables the model to process overlapping skin cancer regions more efficiently, capture finer details, and manage long-range dependencies, while maintaining memory usage and computational efficiency during training. The study also replaces the standard multi-layer perceptron (MLP) in the Swin Transformer with a SwiGLU-based MLP, an upgraded version of the gated linear unit (GLU) module, to achieve higher accuracy, faster training, and better parameter efficiency. The modified Swin-Base model was evaluated on the publicly accessible ISIC 2019 skin dataset with eight classes and compared against popular convolutional neural networks (CNNs) and cutting-edge vision transformer (ViT) models. In an exhaustive assessment on the unseen test dataset, the proposed Swin-Base model demonstrated exceptional performance, achieving an accuracy of 89.36%, a recall of 85.13%, a precision of 88.22%, and an F1-score of 86.65%, surpassing previously reported deep learning models in the literature.
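A minimal PyTorch sketch of a SwiGLU-based MLP block of the kind described is shown below; the hidden sizing and its placement within the Swin block are assumptions.

```python
# SwiGLU MLP: gated replacement for the Swin MLP's Linear-GELU-Linear.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLUMLP(nn.Module):
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.w_gate = nn.Linear(dim, hidden)   # gate branch
        self.w_up = nn.Linear(dim, hidden)     # value branch
        self.w_down = nn.Linear(hidden, dim)   # projection back to dim

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # SwiGLU(x) = (SiLU(x W_gate) * x W_up) W_down
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))

tokens = torch.randn(2, 196, 96)               # (batch, tokens, dim)
print(SwiGLUMLP(96, 256)(tokens).shape)        # torch.Size([2, 196, 96])
```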
PMID:38839675 | DOI:10.1007/s10278-024-01140-8
Deep Learning Models of Multi-Scale Lesion Perception Attention Networks for Diagnosis and Staging of Pneumoconiosis: A Comparative Study with Radiologists
J Imaging Inform Med. 2024 Jun 5. doi: 10.1007/s10278-024-01125-7. Online ahead of print.
ABSTRACT
Accurate prediction of pneumoconiosis is essential for individualized early prevention and treatment. However, varied manifestations and high heterogeneity among radiologists make it difficult to diagnose and stage pneumoconiosis accurately. Here, based on digital radiography (DR) images collected from two centers, a novel deep learning model, Multi-scale Lesion-aware Attention Networks (MLANet), is proposed for the diagnosis of pneumoconiosis, staging of pneumoconiosis, and screening of stage I pneumoconiosis. A series of indicators, including area under the receiver operating characteristic curve, accuracy, recall, precision, and F1 score, were used to comprehensively evaluate the model's performance. The results show that MLANet can effectively improve the consistency and efficiency of pneumoconiosis diagnosis. Its diagnostic accuracy on the internal test set, external validation set, and prospective test set reached 97.87%, 98.03%, and 95.40%, respectively, close to the level of qualified radiologists. Moreover, the model can effectively screen stage I pneumoconiosis with an accuracy of 97.16%, a recall of 98.25%, a precision of 93.42%, and an F1 score of 95.59%, outperforming the four other classification models tested. It is expected to be applied in clinical work to automate the diagnosis of pneumoconiosis on digital chest radiographs, which is of great significance for individualized early prevention and treatment.
PMID:38839674 | DOI:10.1007/s10278-024-01125-7
DSTAN: A Deformable Spatial-temporal Attention Network with Bidirectional Sequence Feature Refinement for Speckle Noise Removal in Thyroid Ultrasound Video
J Imaging Inform Med. 2024 Jun 5. doi: 10.1007/s10278-023-00935-5. Online ahead of print.
ABSTRACT
Thyroid ultrasound video provides significant value for diagnosing thyroid diseases, but the imaging process is often affected by speckle noise, resulting in poor video quality. Numerous video denoising methods have been proposed to remove noise while preserving texture details. However, existing methods still suffer from the following problems: (1) relevant temporal features in low-contrast ultrasound video cannot be accurately aligned and effectively aggregated by simple optical flow or motion estimation, resulting in artifacts and motion blur; (2) a fixed receptive field in spatial feature integration lacks the flexibility to aggregate features across the global region of interest and is susceptible to interference from irrelevant noisy regions. In this work, we propose a deformable spatial-temporal attention denoising network to remove speckle noise in thyroid ultrasound video. The network follows a bidirectional feature propagation mechanism to efficiently exploit the spatial-temporal information of the whole video sequence. Two modules address the above problems: (1) a deformable temporal attention module (DTAM), applied after optical flow pre-alignment, further captures and aggregates relevant temporal features according to learned offsets between frames, so that inter-frame information can be exploited even with imprecise flow estimation under the low contrast of ultrasound video; (2) a deformable spatial attention module (DSAM) flexibly integrates spatial features across the global region of interest through learned intra-frame offsets, so that irrelevant noisy information is ignored and essential information is precisely exploited. Finally, the refined features are rectified and merged through residual convolution blocks to recover clean video frames. Experimental results on our thyroid ultrasound video (US-V) dataset and the DDTI dataset demonstrate that our method exceeds other state-of-the-art methods by 1.2 ∼ 1.3 dB in PSNR and yields clearer texture detail. The proposed model can also assist thyroid nodule segmentation methods in achieving more accurate segmentation, providing an important basis for thyroid diagnosis. In the future, the model can be improved and extended to other medical image sequence datasets, including CT and MRI slice denoising. The code and datasets are provided at https://github.com/Meta-MJ/DSTAN .
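The "learned offsets" mechanism underlying DTAM and DSAM can be sketched with torchvision's deformable convolution as a stand-in, as below; this illustrates the idea of sampling features at learned off-grid positions, not the paper's actual attention modules.

```python
# Deformable sampling: a plain conv predicts per-position offsets that
# steer where the deformable conv gathers features.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformableBlock(nn.Module):
    def __init__(self, ch=32, k=3):
        super().__init__()
        # 2 (x, y) offsets per kernel position -> 2 * k * k channels
        self.offset_pred = nn.Conv2d(ch, 2 * k * k, kernel_size=k, padding=1)
        self.deform = DeformConv2d(ch, ch, kernel_size=k, padding=1)

    def forward(self, feats):
        offsets = self.offset_pred(feats)    # learned sampling offsets
        return self.deform(feats, offsets)   # features gathered off-grid

frame_feats = torch.randn(1, 32, 64, 64)     # one frame's feature map
print(DeformableBlock()(frame_feats).shape)  # torch.Size([1, 32, 64, 64])
```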
PMID:38839673 | DOI:10.1007/s10278-023-00935-5
[18F]FDG PET integrated with structural MRI for accurate brain age prediction
Eur J Nucl Med Mol Imaging. 2024 Jun 6. doi: 10.1007/s00259-024-06784-w. Online ahead of print.
ABSTRACT
PURPOSE: Brain aging is a complex and heterogeneous process characterized by both structural and functional decline. This study aimed to establish a novel deep learning (DL) method for predicting brain age by utilizing structural and metabolic imaging data.
METHODS: The dataset comprised participants from the Universal Medical Imaging Diagnostic Center (UMIDC) and the Alzheimer's Disease Neuroimaging Initiative (ADNI). The former recruited 395 normal control (NC) subjects, while the latter included 438 NC subjects, 51 mild cognitive impairment (MCI) subjects, and 56 Alzheimer's disease (AD) subjects. We developed a novel dual-pathway, 3D simple fully convolutional network (Dual-SFCNeXt) to estimate brain age using [18F]fluorodeoxyglucose positron emission tomography ([18F]FDG PET) and structural magnetic resonance imaging (sMRI) images of NC subjects as input. Several prevailing DL models were trained and tested using either MRI or PET data for comparison. Model accuracy was evaluated using mean absolute error (MAE) and Pearson's correlation coefficient (r). Brain age gap (BAG), the deviation of predicted brain age from chronological age, was correlated with cognitive assessments in MCI and AD subjects.
RESULTS: Both PET- and MRI-based models achieved high prediction accuracy. The leading single-modality model was the SFCNeXt (the single-pathway version) for PET (MAE = 2.92, r = 0.96) and MRI (MAE = 3.23, r = 0.95) on all samples. By integrating both PET and MRI images, the Dual-SFCNeXt demonstrated significantly improved accuracy (MAE = 2.37, r = 0.97) compared to all single-modality models. Significantly higher BAG was observed in both the AD (P < 0.0001) and MCI (P < 0.0001) groups compared to the NC group. BAG correlated significantly with Mini-Mental State Examination (MMSE) scores (r = -0.390 for AD, r = -0.436 for MCI) and Clinical Dementia Rating Scale Sum of Boxes (CDR-SB) scores (r = 0.333 for AD, r = 0.372 for MCI).
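For reference, a small sketch of the reported accuracy metrics and the BAG computation on synthetic stand-in predictions:

```python
# MAE, Pearson's r, and brain age gap (BAG) on synthetic predictions.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
chron_age = rng.uniform(50, 90, 200)            # chronological age
pred_age = chron_age + rng.normal(0, 3, 200)    # stand-in model output

mae = np.mean(np.abs(pred_age - chron_age))
r, _ = pearsonr(pred_age, chron_age)
bag = pred_age - chron_age                      # brain age gap
print(f"MAE = {mae:.2f}, r = {r:.2f}")
# BAG can then be correlated with cognitive scores (e.g., MMSE, CDR-SB).
```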
CONCLUSION: The integration of [18F]FDG PET with structural MRI enhances the accuracy of brain age prediction, potentially introducing a new avenue for related multimodal brain age prediction studies.
PMID:38839623 | DOI:10.1007/s00259-024-06784-w
Improving intracranial aneurysms image quality and diagnostic confidence with deep learning reconstruction in craniocervical CT angiography
Acta Radiol. 2024 Jun 5:2841851241258220. doi: 10.1177/02841851241258220. Online ahead of print.
ABSTRACT
BACKGROUND: The diagnostic impact of deep learning computed tomography (CT) reconstruction on intracranial aneurysm (IA) remains unclear.
PURPOSE: To quantify the image quality and diagnostic confidence on IA in craniocervical CT angiography (CTA) reconstructed with DEep Learning Trained Algorithm (DELTA) compared to the routine hybrid iterative reconstruction (HIR).
MATERIAL AND METHODS: A total of 60 patients who underwent craniocervical CTA and were diagnosed with IA were retrospectively enrolled. Images were reconstructed with DELTA and HIR, and image quality was first compared in terms of noise, signal-to-noise ratio (SNR), and contrast-to-noise ratio (CNR). Next, two radiologists independently graded noise appearance, arterial sharpness, small-vessel visibility, conspicuity of calcifications that may be present in arteries, and overall image quality, each on a 5-point Likert scale. Diagnostic confidence for IAs of various sizes was also graded.
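A hedged sketch of the objective comparison, assuming region-of-interest (ROI) statistics on a stand-in image; the ROI positions and the exact SNR/CNR definitions are illustrative assumptions.

```python
# Noise, SNR, and CNR from ROI statistics on a stand-in image.
import numpy as np

def roi_stats(img: np.ndarray, ys: slice, xs: slice):
    roi = img[ys, xs]
    return roi.mean(), roi.std()

img = np.random.default_rng(0).normal(100, 10, (512, 512))  # stand-in CTA

artery_mu, _ = roi_stats(img, slice(200, 220), slice(200, 220))
tissue_mu, _ = roi_stats(img, slice(300, 320), slice(300, 320))
_, noise_sd = roi_stats(img, slice(10, 60), slice(10, 60))  # background

print("noise:", noise_sd)
print("SNR:", artery_mu / noise_sd)
print("CNR:", (artery_mu - tissue_mu) / noise_sd)
```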
RESULTS: Significantly lower noise and higher SNR and CNR were found on DELTA than on HIR images (all P < 0.05). All five subjective metrics were scored higher by both readers on the DELTA images (all P < 0.05), with good to excellent inter-observer agreement (κ = 0.77-0.93). DELTA images were rated with higher diagnostic confidence on IAs compared to HIR (P < 0.001), particularly for those with size ≤3 mm, which were scored 4.5 ± 0.6 versus 3.4 ± 0.8 and 4.4 ± 0.7 versus 3.5 ± 0.8 by two readers, respectively.
CONCLUSION: DELTA shows potential for improving image quality and the associated confidence in diagnosing IA, and may be worth considering for routine craniocervical CTA applications.
PMID:38839094 | DOI:10.1177/02841851241258220
A joint ESTRO and AAPM guideline for development, clinical validation and reporting of artificial intelligence models in radiation therapy
Radiother Oncol. 2024 Jun 3:110345. doi: 10.1016/j.radonc.2024.110345. Online ahead of print.
ABSTRACT
BACKGROUND AND PURPOSE: Artificial Intelligence (AI) models in radiation therapy are being developed with increasing pace. Despite this, the radiation therapy community has not widely adopted these models in clinical practice. A cohesive guideline on how to develop, report and clinically validate AI algorithms might help bridge this gap.
METHODS AND MATERIALS: A Delphi process with all co-authors was followed to determine which topics should be addressed in this comprehensive guideline. Separate sections of the guideline, including Statements, were written by subgroups of the authors and discussed with the whole group at several meetings. Statements were formulated and scored as highly recommended or recommended.
RESULTS: The following topics were found most relevant: decision making, image analysis, volume segmentation, treatment planning, patient-specific quality assurance of treatment delivery, adaptive treatment, outcome prediction, training, validation and testing of AI model parameters, model availability for others to verify, model quality assurance/updates and upgrades, and ethics. Key references were given, together with an outlook on current hurdles and possibilities to overcome them. Nineteen Statements were formulated.
CONCLUSION: A cohesive guideline has been written that addresses the main topics regarding AI in radiation therapy. It will help guide development, support transparent and consistent reporting and validation of new AI tools, and facilitate adoption.
PMID:38838989 | DOI:10.1016/j.radonc.2024.110345
Adoption of Deep Learning-based Magnetic Resonance Image Information Diagnosis in Brain Function Network Analysis of Parkinson's Disease Patients with End-of-dose Wearing-off
J Neurosci Methods. 2024 Jun 3:110184. doi: 10.1016/j.jneumeth.2024.110184. Online ahead of print.
ABSTRACT
OBJECTIVE: This study aimed to analyze the brain functional network of end-of-dose wearing-off (EODWO) in patients with Parkinson's disease (PD) using a convolutional neural network (CNN)-based functional magnetic resonance imaging (fMRI) data classification model.
METHODS: One hundred PD patients were recruited and assigned to a control (Ctrl) group (39 cases without EODWO) or an experimental (Exp) group (61 cases with EODWO). A CNN-based fMRI data classification model was employed to assist the analysis of changes in brain functional network structure in the two groups. The model improved on a standard CNN architecture by changing the initialization of the convolutional kernel parameters: a restricted Boltzmann machine (RBM)-based structure was first constructed and used to initialize the kernels, after which the model was trained. Using the data analysis module of the GRETNA toolbox, extracted feature sets were analyzed with local measures, such as betweenness centrality (BC) and degree centrality (DC), and global measures, such as global efficiency (Eg) and local efficiency (Eloc).
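A small sketch of the four graph measures on a thresholded stand-in connectivity matrix, using networkx in place of GRETNA; the matrix size and threshold are assumptions.

```python
# BC, DC, Eg, and Eloc on a thresholded functional connectivity graph.
import networkx as nx
import numpy as np

rng = np.random.default_rng(0)
fc = np.abs(rng.normal(0.3, 0.2, (90, 90)))   # stand-in correlation matrix
fc = (fc + fc.T) / 2
np.fill_diagonal(fc, 0)

# Binarize at a threshold (cf. the 0.2-0.5 sparsity range in RESULTS)
G = nx.from_numpy_array((fc > 0.45).astype(int))

bc = nx.betweenness_centrality(G)             # BC per node
dc = nx.degree_centrality(G)                  # DC per node
print("Eg:", nx.global_efficiency(G))
print("Eloc:", nx.local_efficiency(G))
```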
RESULTS: As sparsity increased, Eg trended gradually upward; however, the Eg values of both brain functional networks remained relatively stable within the 0.2 to 0.5 sparsity range. The Eg value of the Ctrl group's whole-brain functional network was 0.17 ± 0.02, and that of the Exp group was 0.17 ± 0.03, with no significant difference between them (P > 0.05). The functional DC value of the superior frontal gyrus in the Exp group (8.71 ± 2.56) was significantly lower than in the Ctrl group (13.32 ± 3.22), whereas the functional DC value of the anterior cingulate gyrus in the Exp group (19.33 ± 4.78) was significantly higher than in the Ctrl group (15.21 ± 4.02) (P < 0.05). No significant correlation was observed between the functional DC value and levodopa or dopamine agonist therapy (DDT) in the Exp group, whereas the Ctrl group exhibited a significant positive correlation.
CONCLUSION: Analysis with a CNN-based fMRI data classification model revealed a correlation between the occurrence of EODWO in PD patients and functional impairment of the left precuneus. Additionally, the occurrence of EODWO may diminish the plasticity of central prefrontal dopamine.
PMID:38838748 | DOI:10.1016/j.jneumeth.2024.110184
Paired conditional generative adversarial network for highly accelerated liver 4D MRI
Phys Med Biol. 2024 Jun 5. doi: 10.1088/1361-6560/ad5489. Online ahead of print.
ABSTRACT
4D MRI with high spatiotemporal resolution is desired for image-guided liver radiotherapy, but acquiring densely sampled k-space data is time-consuming. Accelerated acquisition with sparse sampling is desirable but often causes degraded image quality or long reconstruction times. We propose the Reconstruct Paired Conditional Generative Adversarial Network (Re-Con-GAN) to shorten 4D MRI reconstruction time while maintaining reconstruction quality.
METHODS: Patients who underwent free-breathing liver 4D MRI were included in the study. Fully sampled and retrospectively under-sampled data at 3, 6, and 10 times acceleration (3x, 6x, and 10x) were first reconstructed using the nuFFT algorithm. Re-Con-GAN was then trained on input-output pairs. Three types of networks, ResNet9, UNet, and a reconstruction Swin transformer, were explored as generators, with PatchGAN as the discriminator. Re-Con-GAN processed the data (3D+t) as temporal slices (2D+t). A total of 48 patients with 12332 temporal slices were split into training (37 patients, 10721 slices) and test (11 patients, 1611 slices) sets. Compressed sensing (CS) reconstruction with a spatiotemporal sparsity constraint was used as a benchmark. Reconstructed image quality was further evaluated with a liver gross tumor volume (GTV) localization task using a Mask-RCNN trained on a separate 3D static liver MRI dataset (70 patients; 103 GTV contours).
RESULTS: Re-Con-GAN consistently achieved comparable or better PSNR, SSIM, and RMSE scores than the CS and UNet models. The inference times of Re-Con-GAN, UNet, and CS were 0.15 s, 0.16 s, and 120 s, respectively. The GTV detection task showed that Re-Con-GAN and CS, compared to UNet, better improved the Dice score (3x Re-Con-GAN 80.98%; 3x CS 80.74%; 3x UNet 79.88%) of unprocessed under-sampled images (3x 69.61%).
CONCLUSION: A generative network with adversarial training is proposed, with promising and efficient reconstruction results demonstrated on an in-house dataset. Rapid, high-quality reconstruction of 4D liver MRI has the potential to facilitate online adaptive MR-guided radiotherapy for liver cancer.
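A minimal sketch of a PatchGAN-style discriminator of the kind named above; layer widths and input size are assumptions, and the Re-Con-GAN generator variants (ResNet9/UNet/Swin) are not shown.

```python
# PatchGAN-style discriminator: outputs a grid of per-patch logits.
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    def __init__(self, in_ch=1, base=64):
        super().__init__()
        def block(ci, co, stride):
            return [nn.Conv2d(ci, co, 4, stride, 1),
                    nn.InstanceNorm2d(co), nn.LeakyReLU(0.2)]
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, base, 4, 2, 1), nn.LeakyReLU(0.2),
            *block(base, base * 2, 2),
            *block(base * 2, base * 4, 2),
            *block(base * 4, base * 8, 1),
            nn.Conv2d(base * 8, 1, 4, 1, 1),  # per-patch real/fake logits
        )

    def forward(self, x):
        return self.net(x)

slices = torch.randn(2, 1, 256, 256)          # reconstructed 2D+t slices
print(PatchDiscriminator()(slices).shape)     # torch.Size([2, 1, 30, 30])
```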
PMID:38838679 | DOI:10.1088/1361-6560/ad5489