Deep learning
A universal immunohistochemistry analyzer for generalizing AI-driven assessment of immunohistochemistry across immunostains and cancer types
NPJ Precis Oncol. 2024 Dec 3;8(1):277. doi: 10.1038/s41698-024-00770-z.
ABSTRACT
Immunohistochemistry (IHC) is a common companion diagnostic in targeted therapies. However, quantifying protein expression in IHC images presents a significant challenge due to variability in manual scoring and inherently subjective interpretation. Deep learning (DL) offers a promising approach to address these issues, though current models require extensive training for each cancer and IHC type, limiting their practical application. We developed a Universal IHC (UIHC) analyzer, a DL-based tool that quantifies protein expression across different cancers and IHC types. This multi-cohort trained model outperformed conventional single-cohort models in analyzing unseen IHC images (Kappa score 0.578 vs. up to 0.509) and demonstrated consistent performance across varying positive-staining cutoff values. In a discovery application, the UIHC model assigned higher tumor proportion scores to MET amplification cases, but not to MET exon 14 splicing or other non-small cell lung cancer cases. This UIHC model represents a novel role for DL, further advancing quantitative analysis of IHC.
PMID:39627299 | DOI:10.1038/s41698-024-00770-z
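The model comparison above is reported as a Cohen's Kappa score, i.e., observed rater agreement corrected for chance agreement. A minimal sketch (the labels are hypothetical, not from the study):

```python
from collections import Counter

def cohens_kappa(r1, r2):
    """Cohen's kappa: agreement between two raters beyond chance."""
    n = len(r1)
    po = sum(a == b for a, b in zip(r1, r2)) / n            # observed agreement
    c1, c2 = Counter(r1), Counter(r2)
    labels = set(r1) | set(r2)
    pe = sum(c1[l] / n * c2[l] / n for l in labels)          # chance agreement
    return (po - pe) / (1 - pe)

# Hypothetical IHC calls (0 = negative, 1 = positive) from a model vs. a pathologist
model = [0, 0, 1, 1]
pathologist = [0, 1, 1, 1]
kappa = cohens_kappa(model, pathologist)  # 0.5
```

Kappa of 1.0 means perfect agreement; 0 means agreement no better than chance.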
Correction: Comprehensive Symptom Prediction in Inpatients With Acute Psychiatric Disorders Using Wearable-Based Deep Learning Models: Development and Validation Study
J Med Internet Res. 2024 Dec 3;26:e69042. doi: 10.2196/69042.
ABSTRACT
[This corrects the article DOI: 10.2196/65994.].
PMID:39626223 | DOI:10.2196/69042
Multimodal multiphasic pre-operative image-based deep-learning predicts hepatocellular carcinoma outcomes after curative surgery
Hepatology. 2024 Dec 2. doi: 10.1097/HEP.0000000000001180. Online ahead of print.
ABSTRACT
BACKGROUND: Hepatocellular carcinoma (HCC) recurrence frequently occurs after curative surgery. Histological microvascular invasion (MVI) predicts recurrence but cannot provide pre-operative prognostication, and clinical prediction scores show variable performance.
METHODS: Recurr-NET, a multimodal multiphasic residual-network random survival forest deep-learning model incorporating pre-operative CT and clinical parameters, was developed to predict HCC recurrence. Pre-operative triphasic CT scans were retrieved from patients with resected histology-confirmed HCC from four centers in Hong Kong (Internal-cohort). The internal-cohort was randomly divided in an 8:2 ratio into training and internal-validation. External-testing was performed in an independent cohort from Taiwan.
RESULTS: Among 1231 patients (age 62.4 years, 83.1% male, 86.8% viral hepatitis, median follow-up 65.1 months), cumulative HCC recurrence at years 2 and 5 was 41.8% and 56.4%, respectively. Recurr-NET achieved excellent accuracy in predicting recurrence from years 1-5 (internal cohort AUROC 0.770-0.857; external AUROC 0.758-0.798), significantly outperforming MVI (internal AUROC 0.518-0.590; external AUROC 0.557-0.615) and multiple clinical risk scores (ERASL-PRE, ERASL-POST, DFT, and Shim scores; internal AUROC 0.523-0.587, external AUROC 0.524-0.620) (all p<0.001). Recurr-NET was superior to MVI in stratifying recurrence risks at year 2 (internal: 72.5% vs. 50.0% with MVI; external: 65.3% vs. 46.6%) and year 5 (internal: 86.4% vs. 62.5%; external: 81.4% vs. 63.8%) (all p<0.001). Recurr-NET was also superior to MVI in stratifying liver-related and all-cause mortality (all p<0.001). The performance of Recurr-NET remained robust in subgroup analyses.
CONCLUSION: Recurr-NET accurately predicted HCC recurrence, outperforming both MVI and clinical prediction scores, highlighting its potential for pre-operative prognostication.
PMID:39626212 | DOI:10.1097/HEP.0000000000001180
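The AUROC values above have a direct probabilistic reading: the chance that a randomly chosen recurrence case receives a higher risk score than a randomly chosen non-recurrence case. A minimal rank-based sketch (illustrative, not Recurr-NET's evaluation code):

```python
def auroc(pos_scores, neg_scores):
    """AUROC via the Mann-Whitney formulation: the fraction of
    positive/negative score pairs ranked correctly, ties counted as half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical risk scores: a perfect ranking gives 1.0, an uninformative one 0.5
perfect = auroc([0.9, 0.8], [0.2, 0.1])  # 1.0
```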
Using a flipped classroom teaching and learning approach to promote scientific literacy skill development and retention
FEBS Open Bio. 2024 Dec 3. doi: 10.1002/2211-5463.13938. Online ahead of print.
ABSTRACT
The development of scientific literacy (SL) skills is critical in the life sciences. A flipped classroom reverses traditional learning spaces such that foundational knowledge is acquired by students independently, through recorded lectures and/or readings in advance of the lecture period, and knowledge is consolidated through active learning activities in the classroom. A flipped classroom learning environment can promote critical skill development and knowledge application, and could therefore enhance SL skill development. The objectives here were to (a) determine the effect of a flipped classroom learning environment on SL skill development in second-year kinesiology students enrolled in a research methods course and (b) reassess SL skills 4 months later. SL skills were assessed using the validated Test of Scientific Literacy Skills (TOSLS) questionnaire at the start and end of the semester (n = 57) and reassessed 4 months later, after the summer semester break (n = 46). During the flipped classroom semester, practical SL skills (TOSLS scores) increased by 16.3%, and TOSLS scores were positively correlated with students' final grades (r = 0.526, P < 0.001). Four months later, average TOSLS scores had significantly decreased from the levels at the end of the flipped classroom learning experience. Importantly, retention of SL skills (i.e., TOSLS scores 4 months later) was related to learning approach: retention was positively correlated with deep learning approach scores (r = 0.298, P = 0.044) and negatively correlated with surface learning approach scores (r = -0.314, P = 0.034). Therefore, SL skill retention was higher in students using a deep learning approach (e.g., engagement, self-regulation in learning, and seeking a deeper understanding of concepts) and lower in students using a surface learning approach (e.g., limited engagement, rote memorization of concepts).
Collectively, the results demonstrate the value of a flipped classroom in promoting SL skills while highlighting the role of students' learning approach in critical skill retention.
PMID:39625998 | DOI:10.1002/2211-5463.13938
A fact based analysis of decision trees for improving reliability in cloud computing
PLoS One. 2024 Dec 3;19(12):e0311089. doi: 10.1371/journal.pone.0311089. eCollection 2024.
ABSTRACT
The popularity of cloud computing (CC) has increased significantly in recent years due to its cost-effectiveness and simplified resource allocation. Owing to the exponential rise of cloud computing in the past decade, many corporations and businesses have moved to the cloud to ensure accessibility, scalability, and transparency. The proposed research compares the accuracy and fault prediction of five machine learning algorithms: AdaBoostM1, Bagging, Decision Tree (J48), Deep Learning (Dl4jMLP), and Naive Bayes Tree (NB Tree). The results from secondary data analysis indicate that for the CPU-Mem Multi classifier, the Decision Tree (J48) has the highest accuracy and the least fault prediction, with accuracy rates of 89.71% for the 80/20 split, 90.28% for the 70/30 split, and 92.82% for 10-fold cross-validation. For the HDD-Mono classifier, the accuracy rates are 90.35% for 80/20, 92.35% for 70/30, and 90.49% for 10-fold cross-validation. For the HDD Multi classifier, AdaBoostM1 has the highest accuracy and the least fault prediction, with accuracy rates of 93.63% for 80/20, 90.09% for 70/30, and 88.92% for 10-fold cross-validation. Finally, the CPU-Mem Mono classifier has accuracy rates of 77.87% for 80/20, 77.01% for 70/30, and 77.06% for 10-fold cross-validation. Based on the primary data results, the Naive Bayes Tree (NB Tree) classifier has the highest accuracy with the least fault prediction: 97.05% for 80/20, 96.09% for 70/30, and 96.78% for 10-fold cross-validation. However, its runtime is comparatively poor at 1.01 seconds. The Decision Tree (J48) has the second-highest accuracy rates of 96.78%, 95.95%, and 96.78% for 80/20, 70/30, and 10-fold cross-validation, respectively, also with little fault prediction but a much better runtime of 0.11 seconds.
The difference in accuracy and fault prediction between NB Tree and J48 is only 0.9%, but the difference in runtime is 0.9 seconds. Based on these results, we propose modifications to the Decision Tree (J48) algorithm, as it offers the highest accuracy with the fewest fault-prediction errors: 97.05% accuracy for the 80/20 split, 96.42% for the 70/30 split, and 97.07% for 10-fold cross-validation.
PMID:39625991 | DOI:10.1371/journal.pone.0311089
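The evaluation protocol above (80/20 and 70/30 holdouts plus 10-fold cross-validation) can be sketched generically. Here a trivial majority-class model stands in for J48 or NB Tree, and the data are hypothetical; the point is only the k-fold accounting:

```python
import random

def majority_classifier(train_labels):
    """Stand-in model: always predicts the most common training label."""
    pred = max(set(train_labels), key=train_labels.count)
    return lambda x: pred

def kfold_accuracy(X, y, k=10, seed=0):
    """Mean accuracy over k folds, mirroring a 10-fold CV protocol."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accs = []
    for fold in folds:
        train = [i for i in idx if i not in fold]
        model = majority_classifier([y[i] for i in train])
        correct = sum(model(X[i]) == y[i] for i in fold)
        accs.append(correct / len(fold))
    return sum(accs) / len(accs)

# Demo: 100 samples with a 70/30 label split -> the majority model scores ~0.70
acc = kfold_accuracy(list(range(100)), ['a'] * 70 + ['b'] * 30, k=10)
```

Swapping in a real learner (e.g., a decision tree) only changes the `majority_classifier` line.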
Structural comparison of homologous protein-RNA interfaces reveals widespread overall conservation contrasted with versatility in polar contacts
PLoS Comput Biol. 2024 Dec 3;20(12):e1012650. doi: 10.1371/journal.pcbi.1012650. Online ahead of print.
ABSTRACT
Protein-RNA interactions play a critical role in many cellular processes and pathologies. However, experimental determination of protein-RNA structures remains challenging, so computational tools are needed for the prediction of protein-RNA interfaces. Although evolutionary pressures can be exploited for structural prediction of protein-protein interfaces, and recent deep learning methods using protein multiple sequence alignments have radically improved the performance of protein-protein interface structural prediction, protein-RNA structural prediction is lagging behind due to the scarcity of structural data and the flexibility involved in these complexes. To study the evolution of protein-RNA interface structures, we first identified a large and diverse dataset of 2,022 pairs of structurally homologous interfaces (termed structural interologs). We leveraged this unique dataset to analyze the conservation of interface contacts among structural interologs based on the properties of the amino acids and nucleotides involved. We found that 73% of distance-based contacts and 68% of apolar contacts are conserved on average, and this strong conservation holds even in distant homologs with sequence identity below 20%. Distance-based contacts are also much more conserved than in our previous study of homologous protein-protein interfaces. In contrast, hydrogen bonds, salt bridges, and π-stacking interactions are highly versatile in pairs of protein-RNA interologs, even for close homologs with high interface sequence identity. We found that almost half of the non-conserved distance-based contacts are linked to a small proportion of interface residues that no longer make interface contacts in the interolog, a phenomenon we term "interface switching out".
We also examined possible recovery mechanisms for non-conserved hydrogen bonds and salt bridges, uncovering diverse scenarios of switching out, change in amino acid chemical nature, intermolecular and intramolecular compensations. Our findings provide insights for integrating evolutionary signals into predictive protein-RNA structural modeling methods.
PMID:39625988 | DOI:10.1371/journal.pcbi.1012650
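The "distance-based contacts" analyzed above can be illustrated with a toy geometric sketch. The 5.0 Å cutoff, the atom coordinates, and the assumption that homologous residues are already mapped are all illustrative choices, not the paper's exact definitions:

```python
import math

def distance_contacts(protein_atoms, rna_atoms, cutoff=5.0):
    """Distance-based contacts: (residue, nucleotide) pairs with any atom
    pair closer than `cutoff` angstroms (cutoff is an assumed value)."""
    found = set()
    for res_id, p in protein_atoms:
        for nuc_id, r in rna_atoms:
            if math.dist(p, r) < cutoff:
                found.add((res_id, nuc_id))
    return found

def conservation(contacts_a, contacts_b):
    """Fraction of interface A's contacts preserved in homolog B
    (assumes residue/nucleotide identifiers are already mapped)."""
    return len(contacts_a & contacts_b) / len(contacts_a)

# Toy interface: only ARG1 lies within 5 A of nucleotide A1
prot = [("ARG1", (0.0, 0.0, 0.0)), ("LYS2", (10.0, 0.0, 0.0))]
rna = [("A1", (3.0, 0.0, 0.0))]
found = distance_contacts(prot, rna)
```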
DLLabelsCT: Annotation tool using deep transfer learning to assist in creating new datasets from abdominal computed tomography scans, case study: Pancreas
PLoS One. 2024 Dec 3;19(12):e0313126. doi: 10.1371/journal.pone.0313126. eCollection 2024.
ABSTRACT
The utilization of artificial intelligence (AI) is expanding significantly within medical research and, to some extent, in clinical practice. Deep learning (DL) applications, which use large convolutional neural networks (CNNs), hold considerable potential, especially in optimizing radiological evaluations. However, training DL algorithms to clinical standards requires extensive datasets, and their processing is labor-intensive. In this study, we developed an annotation tool named DLLabelsCT that utilizes CNN models to accelerate the image analysis process. To validate DLLabelsCT, we trained a CNN model with a ResNet34 encoder and a UNet decoder to segment the pancreas on an open-access dataset, used the DL model to assist in annotating a local dataset, and further refined the model with the new annotations. DLLabelsCT was also tested on two external testing datasets. The tool accelerates annotation by a factor of 3.4 compared to a completely manual annotation method. Of the 3,715 CT scan slices in the testing datasets, 50% did not require editing when reviewing the segmentations made by the ResNet34-UNet model, and the mean ± standard deviation of the Dice similarity coefficient was 0.82 ± 0.24. DLLabelsCT is highly accurate and significantly saves time and resources. Furthermore, it can easily be modified to support other deep learning models for other organs, making it an efficient tool for future research involving larger datasets.
PMID:39625972 | DOI:10.1371/journal.pone.0313126
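The Dice similarity coefficient reported above (0.82 ± 0.24) measures overlap between a predicted and a reference segmentation. A minimal sketch on masks represented as pixel-coordinate sets (the masks are hypothetical):

```python
def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary segmentation masks,
    each given as a set of (row, col) pixel coordinates."""
    if not mask_a and not mask_b:
        return 1.0  # both empty: perfect agreement by convention
    return 2 * len(mask_a & mask_b) / (len(mask_a) + len(mask_b))

# Two toy 'pancreas' masks overlapping in 2 of their 3 pixels each
a = {(0, 0), (0, 1), (1, 0)}
b = {(0, 1), (1, 0), (1, 1)}
score = dice(a, b)  # 2*2/(3+3) ~ 0.667
```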
Liver tumor segmentation method combining multi-axis attention and conditional generative adversarial networks
PLoS One. 2024 Dec 3;19(12):e0312105. doi: 10.1371/journal.pone.0312105. eCollection 2024.
ABSTRACT
In modern medical imaging-assisted therapies, manual annotation is commonly employed for liver and tumor segmentation in abdominal CT images. However, this approach suffers from low efficiency and poor accuracy. With the development of deep learning, automatic liver tumor segmentation algorithms based on neural networks have emerged to improve work efficiency. However, existing liver tumor segmentation algorithms still have several limitations: (1) they often encounter class imbalance, as the tumor region is significantly smaller than the normal tissue region, causing models to predict more negative samples and neglect the tumor region; (2) they fail to adequately consider feature fusion between global contexts, leading to the loss of crucial information; (3) they exhibit weak perception of local details such as fuzzy boundaries, irregular shapes, and small lesions, thereby failing to capture important features. To address these issues, we propose a Multi-Axis Attention Conditional Generative Adversarial Network, referred to as MA-cGAN. Firstly, we propose a Multi-Axis attention mechanism (MA) that projects three-dimensional CT images along different axes to extract two-dimensional features. The features from different axes are then fused using learnable factors to capture key information from different directions. Secondly, the MA is incorporated into a U-shaped segmentation network as the generator to enhance its ability to extract detailed features. Thirdly, a conditional generative adversarial network is built by combining a discriminator and a generator to enhance the stability and accuracy of the generator's segmentation results. The MA-cGAN was trained and tested on the public LiTS liver and tumor segmentation challenge dataset.
Experimental results show that MA-cGAN improves the Dice coefficient, Hausdorff distance, average surface distance, and other metrics compared to state-of-the-art segmentation models. The segmented liver and tumor masks have clear edges, fewer false positive regions, and are closer to the ground-truth labels, which supports their use in medical adjuvant therapy. The source code for our proposed model is available at https://github.com/jhliao0525/MA-cGAN.git.
PMID:39625955 | DOI:10.1371/journal.pone.0312105
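The multi-axis idea above, projecting a 3D volume along each axis and fusing the resulting 2D maps with learnable factors, can be reduced to a toy numeric sketch. In the real model the fusion weights are trained by backpropagation; here they are plain floats, and mean-intensity projection stands in for the attention computation:

```python
def fused_projection(vol, weights):
    """Project a cubic 3D volume (nested lists) along each of its three
    axes by mean projection, then fuse the three 2D maps with per-axis
    weights (a toy analogue of the paper's learnable fusion factors)."""
    n = len(vol)
    proj = [
        [[sum(vol[k][i][j] for k in range(n)) / n for j in range(n)] for i in range(n)],
        [[sum(vol[i][k][j] for k in range(n)) / n for j in range(n)] for i in range(n)],
        [[sum(vol[i][j][k] for k in range(n)) / n for j in range(n)] for i in range(n)],
    ]
    s = sum(weights)
    return [[sum(w * proj[a][i][j] for a, w in enumerate(weights)) / s
             for j in range(n)] for i in range(n)]

# A uniform volume fuses to a uniform 2D map regardless of the weights
flat = [[[1.0 for _ in range(2)] for _ in range(2)] for _ in range(2)]
fused = fused_projection(flat, [1.0, 2.0, 3.0])
```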
MetaCONNET: A metagenomic polishing tool for long-read assemblies
PLoS One. 2024 Dec 3;19(12):e0313515. doi: 10.1371/journal.pone.0313515. eCollection 2024.
ABSTRACT
Accurate, high-coverage genome assemblies are the basis for downstream analysis in metagenomic studies. Long-read sequencing technology is an ideal tool to facilitate metagenome assembly, despite the drawback of a relatively high sequencing error rate. Many polishing tools have been developed to correct sequencing errors, but most are designed around one or two species. Considering the complexity and uneven depth of metagenomic studies, we present a novel deep-learning polishing tool named MetaCONNET for polishing metagenomic assemblies. We evaluated MetaCONNET against Medaka, CONNET, and NextPolish in accuracy, coverage, contiguity, and resource consumption. Our results demonstrate that MetaCONNET provides a valuable polishing tool and can be applied to many metagenomic studies.
PMID:39625881 | DOI:10.1371/journal.pone.0313515
Estimating the distribution of numerosity and non-numerical visual magnitudes in natural scenes using computer vision
Psychol Res. 2024 Dec 3;89(1):31. doi: 10.1007/s00426-024-02064-2.
ABSTRACT
Humans share with many animal species the ability to perceive and approximately represent the number of objects in visual scenes. This ability improves throughout childhood, suggesting that learning and development play a key role in shaping our number sense. This hypothesis is further supported by computational investigations based on deep learning, which have shown that numerosity perception can spontaneously emerge in neural networks that learn the statistical structure of images with a varying number of items. However, neural network models are usually trained using synthetic datasets that might not faithfully reflect the statistical structure of natural environments, and there is also growing interest in using more ecological visual stimuli to investigate numerosity perception in humans. In this work, we exploit recent advances in computer vision algorithms to design and implement an original pipeline that can be used to estimate the distribution of numerosity and non-numerical magnitudes in large-scale datasets containing thousands of real images depicting objects in daily life situations. We show that in natural visual scenes the frequency of appearance of different numerosities follows a power law distribution. Moreover, we show that the correlational structure for numerosity and continuous magnitudes is stable across datasets and scene types (homogeneous vs. heterogeneous object sets). We suggest that considering such an "ecological" pattern of covariance is important to understand the influence of non-numerical visual cues on numerosity judgements.
PMID:39625570 | DOI:10.1007/s00426-024-02064-2
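A power-law frequency distribution, as reported above for numerosities in natural scenes, is commonly checked by fitting a straight line in log-log space. A minimal least-squares sketch (a simple estimator on synthetic counts, not the paper's analysis pipeline):

```python
import math

def fit_power_law(numerosities, counts):
    """Fit counts ~ C * n**(-alpha) by least squares in log-log space;
    returns the estimated exponent alpha (the negated log-log slope)."""
    xs = [math.log(n) for n in numerosities]
    ys = [math.log(c) for c in counts]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    return -slope

# Synthetic counts following an exact power law with exponent 2
ns = list(range(1, 6))
counts = [1000 / n ** 2 for n in ns]
alpha = fit_power_law(ns, counts)  # ~2.0
```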
Effects of Physical Activity and Inactivity on Microvasculature in Children: The Hong Kong Children Eye Study
Invest Ophthalmol Vis Sci. 2024 Dec 2;65(14):7. doi: 10.1167/iovs.65.14.7.
ABSTRACT
PURPOSE: The purpose of this study was to investigate the effects of physical activity and inactivity on the microvasculature in children, as measured from retinal photographs.
METHODS: All participants were from the Hong Kong Children Eye Study, a population-based cross-sectional study of children aged 6 to 8 years. They received comprehensive ophthalmic examinations and retinal photography. Their demographics and involvement in physical activity and inactivity were obtained from validated questionnaires. A validated Deep Learning System was used to measure, from retinal photographs, central retinal arteriolar equivalent (CRAE) and central retinal venular equivalent (CRVE).
RESULTS: In the final analysis of 11,959 participants, 6244 (52.2%) were boys and the mean age was 7.55 (1.05) years. Increased ratio of physical activity to inactivity was associated with wider CRAE (β = 1.033, P = 0.007) and narrower CRVE (β = -2.079, P < 0.001). In the subgroup analysis of boys, increased ratio of physical activity to inactivity was associated with wider CRAE (β = 1.364, P = 0.013) and narrower CRVE (β = -2.563, P = 0.001). The subgroup analysis of girls also showed increased ratio of physical activity to inactivity was associated with narrower CRVE (β = -1.759, P = 0.020), but not CRAE.
CONCLUSIONS: Increased activity in children is associated with healthier microvasculature, as shown in the retina. Our study contributes to the growing evidence that physical activity positively influences vascular health from a young age. Therefore, this study also underscores the potential of using the retinal vasculature as a biomarker of cardiovascular health.
PMID:39625440 | DOI:10.1167/iovs.65.14.7
Development of an Open-Source Dataset of Flat-Mounted Images for the Murine Oxygen-Induced Retinopathy Model of Ischemic Retinopathy
Transl Vis Sci Technol. 2024 Dec 2;13(12):4. doi: 10.1167/tvst.13.12.4.
ABSTRACT
PURPOSE: To describe an open-source dataset of flat-mounted retinal images and vessel segmentations from mice subject to the oxygen-induced retinopathy (OIR) model.
METHODS: Flat-mounted retinal images from mice killed at postnatal days 12 (P12), P17, and P25 used in prior OIR studies were compiled. Mice subjected to normoxic conditions were killed at P12, P17, and P25, and their retinas were flat-mounted for imaging. Major blood vessels from the OIR images were manually segmented by four graders (JSC, HKR, KBL, JM), with cross-validation performed to ensure similar grading.
RESULTS: Overall, 1170 images were included in this dataset. Of these, 111 were of normoxic mouse retinas and 1048 were from mice subjected to OIR. The majority of images from OIR mice were obtained at P17. The 50 images obtained from an external dataset, OIRSeg, did not have age labels. All images were manually segmented and used in the training or testing of a previously published deep learning algorithm.
CONCLUSIONS: This is the first open-source dataset of original and segmented flat-mounted retinal images. The dataset has potential applications for expanding the development of generalizable and larger-scale artificial intelligence and analyses for OIR. This dataset is published online and publicly available at dx.doi.org/10.6084/m9.figshare.23690973.
TRANSLATIONAL RELEVANCE: This open access dataset serves as a source of raw data for future research involving big data and artificial intelligence research concerning oxygen-induced retinopathy.
PMID:39625436 | DOI:10.1167/tvst.13.12.4
Error compensated MOF-based ReRAM array for encrypted logical operations
Dalton Trans. 2024 Dec 3. doi: 10.1039/d4dt02880e. Online ahead of print.
ABSTRACT
Metal-organic frameworks (MOFs) form a unique platform for operating on data using ReRAM technology. Here we report the large-scale fabrication of a MOF-based ReRAM array with 6 × 6 cells, whose electronic parameters vary by 50%. Despite this inhomogeneity, such a "non-ideal" ReRAM array is used to record binary information, with deep learning applied afterwards to achieve 95% read accuracy. Next, the same ReRAM array is used to record numbers (from 0 to 15) and perform addition. For the correct performance of this analogue algorithm, we determine a set of unique coefficients for each ReRAM cell, allowing the set to serve as an encryption key controlling access to the logical operations. These results demonstrate the potential of "non-ideal" MOF-based ReRAM for low-error information readout coupled with encrypted logical operations.
PMID:39625410 | DOI:10.1039/d4dt02880e
Deep Learning for Automated Segmentation of Basal Cell Carcinoma on Mohs Micrographic Surgery Frozen Section Slides
Dermatol Surg. 2024 Dec 3. doi: 10.1097/DSS.0000000000004501. Online ahead of print.
ABSTRACT
BACKGROUND: Deep learning has been used to classify basal cell carcinoma (BCC) on histopathologic images. Segmentation models, required for tumor localization on Mohs micrographic surgery (MMS) frozen section slides, have yet to reach clinical utility.
OBJECTIVE: To train a segmentation model to localize BCC on MMS frozen section slides and to evaluate performance by BCC subtype.
MATERIALS AND METHODS: The study included 348 fresh frozen tissue slides, scanned as whole slide images, from patients treated with MMS for BCC. BCC foci were manually outlined using the Grand Challenge annotation platform. The data set was divided into 80% for training, 10% for validation, and 10% for the test data set. Segmentation was performed using the Ultralytics YOLOv8 model.
RESULTS: Sensitivity was .71 for all tumors, .87 for nodular BCC, .79 for superficial BCC, .74 for micronodular BCC, and .51 for morpheaform and infiltrative BCC. Specificity was .75 for all tumors, .59 for nodular BCC, .58 for superficial BCC, .83 for micronodular BCC, and .74 for morpheaform and infiltrative BCC.
CONCLUSION: This study trained a segmentation model to localize BCC on MMS frozen section slides with reasonably high sensitivity and specificity, which varied by BCC subtype. More accurate and clinically relevant performance metrics for segmentation studies are needed.
PMID:39625169 | DOI:10.1097/DSS.0000000000004501
Breast radiotherapy planning: A decision-making framework using deep learning
Med Phys. 2024 Dec 3. doi: 10.1002/mp.17527. Online ahead of print.
ABSTRACT
BACKGROUND: Effective breast cancer treatment planning requires balancing tumor control while minimizing radiation exposure to healthy tissues. Choosing between intensity-modulated radiation therapy (IMRT) and three-dimensional conformal radiation therapy (3D-CRT) remains pivotal, influenced by patient anatomy and dosimetric constraints.
PURPOSE: This study aims to develop a decision-making framework utilizing deep learning to predict dose distributions, aiding in the selection of optimal treatment techniques.
METHODS: A 2D U-Net convolutional neural network (CNN) model was used to predict dose distribution maps and dose-volume histogram (DVH) metrics for breast cancer patients undergoing IMRT and 3D-CRT. The model was trained and fine-tuned using retrospective datasets from two medical centers, accounting for variations in CT systems, dosimetric protocols, and clinical practices, over 346 patients. An additional 30 consecutive patients were selected for external validation, where both 3D-CRT and IMRT plans were manually created. To show the potential of the approach, an independent medical physicist evaluated both dosimetric plans and selected the most appropriate one based on applicable clinical criteria. Confusion matrices were used to compare the decisions of the independent observer with the historical decision and the proposed decision-making framework.
RESULTS: Evaluation metrics, including Dice similarity coefficients (DSC) and DVH analyses, demonstrated high concordance between predicted and clinical dose distributions for both IMRT and 3D-CRT techniques, especially for organs at risk (OARs). The decision-making framework demonstrated high accuracy (90%), recall (95.7%), and precision (91.7%) when compared to independent clinical evaluations, while the historical decision-making had lower accuracy (50%), recall (47.8%), and precision (78.6%).
CONCLUSIONS: The proposed decision-making model accurately predicts dose distributions for both 3D-CRT and IMRT, ensuring reliable OAR dose estimation. This decision-making framework significantly outperforms historical decision-making, demonstrating higher accuracy, recall, and precision.
PMID:39625151 | DOI:10.1002/mp.17527
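The accuracy/recall/precision figures above follow directly from confusion-matrix counts. A minimal sketch; the counts below are hypothetical values chosen only because they reproduce the reported 90%/95.7%/91.7% on a 30-patient set, they are not taken from the paper:

```python
def metrics(tp, fp, fn, tn):
    """Accuracy, recall, and precision from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # correct decisions overall
        "recall": tp / (tp + fn),        # positives that were caught
        "precision": tp / (tp + fp),     # flagged cases that were right
    }

# Hypothetical split of 30 validation patients (illustrative only)
m = metrics(tp=22, fp=2, fn=1, tn=5)
```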
Deep learning based super-resolution for CBCT dose reduction in radiotherapy
Med Phys. 2024 Dec 3. doi: 10.1002/mp.17557. Online ahead of print.
ABSTRACT
BACKGROUND: Cone-beam computed tomography (CBCT) is a crucial daily imaging modality in image-guided and adaptive radiotherapy. However, the use of ionizing radiation in CBCT imaging increases the risk of secondary cancers, which is particularly concerning for pediatric patients. Deep learning super-resolution has shown promising results in enhancing the resolution of photographic and medical images but has not yet been explored in the context of CBCT dose reduction.
PURPOSE: To facilitate CBCT imaging dose reduction, we propose using an enhanced super-resolution generative adversarial network (ESRGAN) in both the projection and image domains to restore the image quality of low-dose CBCT.
METHODS: An extensive projection database, containing 2997 CBCT scans from head and neck cancer patients, was used to train two different ESRGAN models to generate super-resolution CBCTs. One model operated in the projection domain, using pairs of simulated low-resolution (low-dose) and original high-resolution (high-dose) projections and yielded CBCTSRpro. The other model operated in the image domain, using pairs of axial slices from reconstructed low-resolution and high-resolution CBCTs (CBCTLR and CBCTHR) and resulted in CBCTSRimg. Super-resolution CBCTs were evaluated in terms of image similarity (MAE, ME, PSNR, and SSIM), noise characteristics, spatial resolution, and registration accuracy, using the original CBCT as a reference. To test the perceptual difference between the original and super-resolution CBCT, we performed a visual Turing test.
RESULTS: Visually, both super-resolution approaches in the projection and image domains improved the image quality of low-dose CBCTs. This was confirmed by the visual Turing test, which showed low accuracy, sensitivity, and specificity, indicating almost no perceptual difference between CBCTHR and the super-resolution CBCTs. CBCTSRimg (accuracy: 0.55, sensitivity: 0.59, specificity: 0.50) performed slightly better than CBCTSRpro (accuracy: 0.59, sensitivity: 0.61, specificity: 0.57). Image similarity metrics were affected by varying noise levels and did not reflect the visual improvements, with MAE/ME/PSNR/SSIM values of 110.4 HU/2.9 HU/40.4 dB/0.82 for CBCTLR, 136.6 HU/-0.4 HU/38.6 dB/0.77 for CBCTSRpro, and 128.2 HU/1.9 HU/39.0 dB/0.80 for CBCTSRimg. In terms of spatial resolution, quantified by calculating 10% levels of the task transfer function, both CBCTSRpro and CBCTSRimg outperformed CBCTLR and nearly matched the reference CBCTHR (CBCTLR: 0.66 lp/mm, CBCTSRpro: 0.88 lp/mm, CBCTSRimg: 0.95 lp/mm, CBCTHR: 1.01 lp/mm). Noise characteristics of CBCTSRimg and CBCTSRpro were comparable to the reference CBCTHR. Registration parameters showed negligible differences for all CBCTs (CBCTLR, CBCTSRpro, CBCTSRimg), with average absolute differences in registration parameters being below 0.4° for rotations and below 0.06 mm for translations (CBCTHR as reference).
CONCLUSIONS: This study demonstrates that deep learning can be a valuable tool for CBCT dose reduction in CBCT-guided radiotherapy by acquiring low-dose CBCTs and restoring the image quality using deep learning super-resolution. The results suggest that higher quality images can be generated when super-resolution is performed in the image domain compared to the projection domain.
PMID:39625126 | DOI:10.1002/mp.17557
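The MAE and PSNR figures above are standard image-similarity metrics. A minimal sketch on flattened intensity lists; the 4096 data range is an assumed 12-bit CT-style dynamic range, not a value from the paper:

```python
import math

def mae(a, b):
    """Mean absolute error between two images given as flat intensity lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def psnr(a, b, data_range=4096.0):
    """Peak signal-to-noise ratio in dB; `data_range` is an assumed
    dynamic range (e.g. a 12-bit CT window), higher PSNR = more similar."""
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    if mse == 0:
        return float("inf")
    return 20 * math.log10(data_range / math.sqrt(mse))
```

For example, a constant offset of 2 HU gives MAE 2.0 and PSNR 20*log10(4096/2) ≈ 66.2 dB.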
Low dose threshold for measuring cardiac functional metrics using four-dimensional CT with deep learning
J Appl Clin Med Phys. 2024 Dec 3:e14593. doi: 10.1002/acm2.14593. Online ahead of print.
ABSTRACT
BACKGROUND: Four-dimensional CT is increasingly used for functional cardiac imaging, including prognosis in conditions such as heart failure and after myocardial infarction. However, radiation dose from an acquisition spanning the full cardiac cycle remains a concern. This work investigates the possibility of dose reduction in 4DCT using deep learning (DL)-based segmentation techniques as an objective observer.
METHODS: A 3D residual U-Net was developed for segmentation of left ventricle (LV) myocardium and blood pool. Two networks were trained: Standard DL (trained with only standard-dose [SD] data) and Noise-Robust DL (additionally trained with low-dose data). The primary goal of the proposed DL methods is to serve as an unbiased and consistent observer for functional analysis performance. Functional cardiac metrics including ejection fraction (EF), global longitudinal strain (GLS), circumferential strain (CS), and wall thickness (WT), were measured for an external test set of 250 Cardiac CT volumes reconstructed at five different dose levels.
RESULTS: Functional metrics obtained from DL segmentations of standard-dose images matched well with those from expert manual analysis. With Standard DL, the absolute difference between metrics derived from standard-dose data and from 100 mA data (corresponding to ∼76 ± 13% dose reduction) was less than 0.8 ± 1.0% for EF, GLS, and CS, and 5.6 ± 6.7% for average WT. The performance variation of Noise-Robust DL remained acceptable even at 50 mA.
CONCLUSION: We demonstrate that on average radiation dose can be reduced by a factor of 5 while introducing minimal changes to global functional metrics (especially EF, GLS, and CS). The robustness to reduced image quality can be further boosted by using emulated low-dose data in the DL training set.
PMID:39625106 | DOI:10.1002/acm2.14593
Process-Informed Neural Networks: A Hybrid Modelling Approach to Improve Predictive Performance and Inference of Neural Networks in Ecology and Beyond
Ecol Lett. 2024 Nov;27(11):e70012. doi: 10.1111/ele.70012.
ABSTRACT
Despite deep learning being state of the art for data-driven model predictions, its application in ecology is currently subject to two important constraints: (i) deep-learning methods are powerful in data-rich regimes, but ecological data are typically sparse; and (ii) deep-learning models are black boxes, so the processes they represent are non-trivial to infer. Process-based (= mechanistic) models are not constrained by data sparsity or unclear processes and are thus important for building up ecological knowledge and transferring it to applications. In this work, we combine process-based models and neural networks into process-informed neural networks (PINNs), which incorporate process knowledge directly into the neural network structure. In a systematic evaluation of spatial and temporal prediction tasks for C-fluxes in temperate forests, we show the ability of five different types of PINNs (i) to outperform process-based models and neural networks, especially in data-sparse regimes and high-transfer tasks, and (ii) to inform on mis- or undetected processes.
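As a toy illustration of the hybrid idea (not any of the paper's actual PINN variants), the sketch below couples a mechanistic light-use-efficiency-style flux model with a small neural network that learns only the residual, so the process knowledge constrains the prediction while the network corrects what the mechanism misses; the process equation, data, and network sizes are all illustrative assumptions:

```python
import numpy as np

def process_model(par, eps=0.5):
    # Mechanistic part: flux proportional to absorbed radiation (toy LUE model)
    return eps * par

rng = np.random.default_rng(1)
par = rng.uniform(0, 30, size=(200, 1))      # photosynthetically active radiation
true_flux = 0.5 * par + 0.05 * par ** 1.3    # "truth" with an unmodelled term

# One-hidden-layer network trained by gradient descent on the process residual
W1 = rng.normal(0, 0.1, (1, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.1, (8, 1)); b2 = np.zeros(1)
lr = 1e-3
for _ in range(2000):
    h = np.tanh(par @ W1 + b1)
    resid_pred = h @ W2 + b2
    err = (process_model(par) + resid_pred) - true_flux
    # Backpropagate the mean-squared error through both layers
    gW2 = h.T @ err / len(par); gb2 = err.mean(0)
    gh = err @ W2.T * (1 - h ** 2)
    gW1 = par.T @ gh / len(par); gb1 = gh.mean(0)
    W2 -= lr * gW2; b2 -= lr * gb2; W1 -= lr * gW1; b1 -= lr * gb1

hybrid = process_model(par) + np.tanh(par @ W1 + b1) @ W2 + b2
print("process-only RMSE:", np.sqrt(np.mean((process_model(par) - true_flux) ** 2)))
print("hybrid RMSE:", np.sqrt(np.mean((hybrid - true_flux) ** 2)))
```

Because the network only has to learn the residual, it needs far less data than a purely data-driven model of the whole flux, which mirrors the data-sparse advantage claimed above.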
PMID:39625058 | DOI:10.1111/ele.70012
Efficient generation of HPLC and FTIR data for quality assessment using time series generation model: a case study on Tibetan medicine Shilajit
Front Pharmacol. 2024 Nov 18;15:1503508. doi: 10.3389/fphar.2024.1503508. eCollection 2024.
ABSTRACT
BACKGROUND: The scarcity and preciousness of plateau characteristic medicinal plants pose a significant challenge in obtaining sufficient quantities of experimental samples for quality evaluation. Insufficient sample sizes often lead to ambiguous and questionable quality assessments and suboptimal performance in pattern recognition. Shilajit, a popular Tibetan medicine, is harvested from high altitudes above 2000 m, making it difficult to obtain. Additionally, the complex geographical environment results in low uniformity of Shilajit quality.
METHODS: To address these challenges, this study employed a deep learning model, the time vector quantization variational autoencoder (TimeVQVAE), to generate data matrices based on chromatographic and spectral data for different grades of Shilajit, thereby increasing the amount of data. Partial least squares discriminant analysis (PLS-DA) was used to identify three grades of Shilajit samples based on original, generated, and combined data.
RESULTS: Compared with the original high-performance liquid chromatography (HPLC) and Fourier transform infrared spectroscopy (FTIR) data, the data generated by TimeVQVAE effectively preserved the chemical profile. In the test set, the average matrices for HPLC, FTIR, and combined data increased by 32.2%, 15.9%, and 23.0%, respectively. On the real test data, the PLS-DA model's classification accuracy reached at most 0.7905 when trained on the original data alone; after incorporating TimeVQVAE-generated data, accuracy improved significantly to 0.9442. The PLS-DA model trained with the fused data also showed enhanced stability.
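The augmentation principle can be illustrated with a deliberately simplified stand-in: jittered copies of each spectrum take the place of TimeVQVAE-generated samples, and a nearest-centroid rule takes the place of PLS-DA. All spectra, grades, peak positions, and noise levels below are synthetic assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
n_features = 50

def make_samples(grade, n, noise=0.1):
    # Toy "spectra": each grade has its own block of peaks plus noise
    template = np.zeros(n_features)
    template[grade * 15: grade * 15 + 10] = 1.0
    return template + rng.normal(0, noise, (n, n_features))

# Small real training set (3 grades x 5 spectra), then augment each class
# with perturbed copies standing in for generated samples
train = {g: make_samples(g, 5) for g in range(3)}
augmented = {g: np.vstack([x, x + rng.normal(0, 0.05, x.shape)])
             for g, x in train.items()}
centroids = np.stack([augmented[g].mean(axis=0) for g in range(3)])

# Held-out "real" test spectra, classified by nearest centroid
test_X = np.vstack([make_samples(g, 10) for g in range(3)])
test_y = np.repeat(np.arange(3), 10)
dists = ((test_X[:, None, :] - centroids[None]) ** 2).sum(-1)
pred = np.argmin(dists, axis=1)
print("test accuracy:", (pred == test_y).mean())
```

This only demonstrates the pipeline shape (train on real plus generated data, evaluate on real data); the study's gains depend on TimeVQVAE producing chemically faithful samples, which simple jitter cannot replicate.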
CONCLUSION: This study offers a novel and effective approach for researching medicinal materials with small sample sizes, and shows how data augmentation strategies can address the limits such sample sizes place on model performance.
PMID:39624838 | PMC:PMC11608951 | DOI:10.3389/fphar.2024.1503508
Retinal Vessel Plexus Differentiation Based on OCT Angiography Using Deep Learning
Ophthalmol Sci. 2024 Aug 23;5(1):100605. doi: 10.1016/j.xops.2024.100605. eCollection 2025 Jan-Feb.
ABSTRACT
PURPOSE: Although structural OCT is traditionally used to differentiate the vascular plexus layers in OCT angiography (OCTA), the vascular plexuses do not always obey the retinal laminations. We sought to segment the superficial, deep, and avascular plexuses from OCTA images using deep learning without structural OCT image input or segmentation boundaries.
DESIGN: Cross-sectional study.
SUBJECTS: The study included 235 OCTA cubes from 33 patients for training and testing of the model.
METHODS: From each OCTA cube, 3 weakly labeled images representing the superficial, deep, and avascular plexuses were obtained for a total of 705 starting images. Images were augmented with standard intensity and geometric transforms, and regions from adjacent plexuses were programmatically combined to create synthetic 2-class images for each OCTA cube. Images were partitioned on a per-patient basis into training, validation, and reserved test groups to train and evaluate a U-Net-based machine learning model. To investigate the generalization of the model, we applied the model to multiclass thin slabs from OCTA volumes and qualitatively observed the resulting B-scans.
MAIN OUTCOME MEASURES: Plexus segmentation performance was assessed quantitatively using Dice scores on a held-out test set.
RESULTS: After training on single-class plexus images, our model achieved good results (Dice scores > 0.82), which improved further with the synthetic 2-class images (Dice scores > 0.95). Although not trained on more complex multiclass slabs, the model still performed plexus labeling on slab data. This indicates that OCTA data alone show promise for segmenting the superficial, deep, and avascular plexuses without requiring OCT layer segmentations, and that the synthetic 2-class images yield a significant performance improvement.
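The Dice score used as the outcome measure is twice the overlap between predicted and reference masks divided by their combined size; a minimal sketch on toy binary masks (the masks themselves are illustrative, not study data):

```python
import numpy as np

def dice(pred, truth, eps=1e-8):
    # Dice similarity coefficient between two binary masks
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum() + eps)

# Two partially overlapping 4x4 squares in an 8x8 grid
a = np.zeros((8, 8), dtype=int); a[2:6, 2:6] = 1
b = np.zeros((8, 8), dtype=int); b[3:7, 3:7] = 1
print(round(dice(a, b), 3))
```

A score of 1 means perfect overlap, 0 means none, so the reported Dice > 0.95 indicates near-complete agreement with the held-out labels.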
CONCLUSIONS: This study presents the use of OCTA data alone to segment the superficial, deep, and avascular plexuses of the retina, confirming that use of structural OCT layer segmentations as boundaries is not required.
FINANCIAL DISCLOSURES: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
PMID:39624795 | PMC:PMC11609517 | DOI:10.1016/j.xops.2024.100605