Deep learning

Enhancing Embedded Object Tracking: A Hardware Acceleration Approach for Real-Time Predictability

Wed, 2024-03-27 06:00

J Imaging. 2024 Mar 13;10(3):70. doi: 10.3390/jimaging10030070.

ABSTRACT

While Siamese object tracking has witnessed significant advancements, its hard real-time behaviour on embedded devices remains inadequately addressed. In many application cases, an embedded implementation should not only have a minimal execution latency, but this latency should ideally also have zero variance, i.e., be predictable. This study aims to address this issue by meticulously analysing real-time predictability across different components of a deep-learning-based video object tracking system. Our detailed experiments not only indicate the superiority of Field-Programmable Gate Array (FPGA) implementations in terms of hard real-time behaviour but also unveil important time predictability bottlenecks. We introduce dedicated hardware accelerators for key processes, focusing on depth-wise cross-correlation and padding operations, utilizing high-level synthesis (HLS). Implemented on a KV260 board, our enhanced tracker not only achieves a 6.6-fold speed-up in mean execution time but also significantly improves hard real-time predictability, yielding 11 times less latency variation than our baseline. A subsequent analysis of power consumption reveals our approach's contribution to enhanced power efficiency. These advancements underscore the crucial role of hardware acceleration in realizing time-predictable object tracking on embedded systems, setting new standards for future hardware-software co-design endeavours in this domain.
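
For reference, the depth-wise cross-correlation that the accelerator targets can be sketched as a grouped convolution, as in the PyTorch snippet below; the shapes and variable names are illustrative only and do not reflect the authors' HLS implementation.

```python
# Minimal sketch of depth-wise cross-correlation (SiamRPN++-style), the
# operation accelerated on the FPGA. Shapes are illustrative assumptions.
import torch
import torch.nn.functional as F

def depthwise_xcorr(search: torch.Tensor, template: torch.Tensor) -> torch.Tensor:
    """Correlate each channel of the search features with the matching
    channel of the template features."""
    b, c, hx, wx = search.shape
    _, _, hz, wz = template.shape
    # Fold the batch into the channel dimension so grouped convolution
    # performs one independent correlation per (batch, channel) pair.
    search = search.view(1, b * c, hx, wx)
    kernel = template.view(b * c, 1, hz, wz)
    out = F.conv2d(search, kernel, groups=b * c)
    return out.view(b, c, out.size(2), out.size(3))

# Example: a 256-channel 7x7 template slid over a 31x31 search region.
z = torch.randn(1, 256, 7, 7)
x = torch.randn(1, 256, 31, 31)
print(depthwise_xcorr(x, z).shape)  # torch.Size([1, 256, 25, 25])
```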

PMID:38535150 | DOI:10.3390/jimaging10030070

Categories: Literature Watch

Revolutionizing Cow Welfare Monitoring: A Novel Top-View Perspective with Depth Camera-Based Lameness Classification

Wed, 2024-03-27 06:00

J Imaging. 2024 Mar 8;10(3):67. doi: 10.3390/jimaging10030067.

ABSTRACT

This study advances livestock health management by integrating a top-view 3D depth camera with deep learning for accurate cow lameness detection, classification, and precise segmentation, distinguishing it from 2D systems. It underscores the importance of early lameness detection in cattle and focuses on extracting depth data from the cow's body, with a specific emphasis on the back region's maximum value. Precise cow detection and tracking are achieved through the Detectron2 framework and Intersection Over Union (IOU) techniques. Across a three-day testing period, with observations conducted twice daily with varying cow populations (ranging from 56 to 64 cows per day), the study consistently achieves an impressive average detection accuracy of 99.94%. Tracking accuracy remains at 99.92% over the same observation period. Subsequently, the research extracts the cow's depth region using binary mask images derived from detection results and original depth images. Feature extraction generates a feature vector based on maximum height measurements from the cow's backbone area. This feature vector is utilized for classification, evaluating three classifiers: Random Forest (RF), K-Nearest Neighbor (KNN), and Decision Tree (DT). The study highlights the potential of top-view depth video cameras for accurate cow lameness detection and classification, with significant implications for livestock health management.
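
As an illustration of the Intersection Over Union (IOU) association step mentioned above, the sketch below greedily matches detections to existing tracks; the box format and threshold are assumptions, not the authors' exact settings.

```python
# Minimal sketch of IOU-based frame-to-frame association of cow detections.
def iou(box_a, box_b):
    """Boxes as (x1, y1, x2, y2); returns intersection over union."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def match_tracks(tracks, detections, threshold=0.5):
    """Greedily assign each detection to the existing track with the
    highest IOU above a threshold. `tracks` maps track id -> last box."""
    assignments = {}
    for det_id, det in enumerate(detections):
        best_track, best_iou = None, threshold
        for track_id, last_box in tracks.items():
            score = iou(last_box, det)
            if score > best_iou:
                best_track, best_iou = track_id, score
        if best_track is not None:
            assignments[det_id] = best_track
    return assignments
```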

PMID:38535147 | DOI:10.3390/jimaging10030067

Categories: Literature Watch

Historical Text Line Segmentation Using Deep Learning Algorithms: Mask-RCNN against U-Net Networks

Wed, 2024-03-27 06:00

J Imaging. 2024 Mar 5;10(3):65. doi: 10.3390/jimaging10030065.

ABSTRACT

Text line segmentation is a necessary preliminary step before most text transcription algorithms are applied. The leading deep learning networks used in this context (ARU-Net, dhSegment, and Doc-UFCN) are based on the U-Net architecture. They are efficient, but fall under the same concept, requiring a post-processing step to perform instance (e.g., text line) segmentation. In the present work, we test the advantages of Mask-RCNN, which is designed to perform instance segmentation directly. This work is the first to directly compare Mask-RCNN- and U-Net-based networks on text segmentation of historical documents, showing the superiority of the former over the latter. Three studies were conducted, one comparing these networks on different historical databases, another comparing Mask-RCNN with Doc-UFCN on a private historical database, and a third comparing the handwritten text recognition (HTR) performance of the tested networks. The results showed that Mask-RCNN outperformed ARU-Net, dhSegment, and Doc-UFCN using relevant line segmentation metrics, that performance evaluation should not focus on the raw masks generated by the networks, that a light mask processing is an efficient and simple solution to improve evaluation, and that Mask-RCNN leads to better HTR performance.

PMID:38535145 | DOI:10.3390/jimaging10030065

Categories: Literature Watch

Elevating Chest X-ray Image Super-Resolution with Residual Network Enhancement

Wed, 2024-03-27 06:00

J Imaging. 2024 Mar 4;10(3):64. doi: 10.3390/jimaging10030064.

ABSTRACT

Chest X-ray (CXR) imaging plays a pivotal role in diagnosing various pulmonary diseases, which account for a significant portion of the global mortality rate, as recognized by the World Health Organization (WHO). Medical practitioners routinely depend on CXR images to identify anomalies and make critical clinical decisions. Dramatic improvements in super-resolution (SR) have been achieved by applying deep learning techniques. However, some SR methods are difficult to apply effectively because their low-resolution inputs and features contain abundant low-frequency information, as is the case in X-ray image super-resolution. In this paper, we introduce an advanced deep learning-based SR approach that incorporates the innovative residual-in-residual (RIR) structure to augment the diagnostic potential of CXR imaging. Specifically, we propose a lightweight network consisting of residual groups built from residual blocks, with multiple skip connections that allow abundant low-frequency information to be bypassed efficiently. This approach allows the main network to concentrate on learning high-frequency information. In addition, we adopted dense feature fusion within residual groups and designed highly parallel residual blocks for better feature extraction. Our proposed methods exhibit superior performance compared to existing state-of-the-art (SOTA) SR methods, delivering enhanced accuracy and notable visual improvements, as evidenced by our results.
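
As an illustration of the residual-in-residual idea (short skips inside residual blocks plus a long skip around each residual group), a minimal PyTorch sketch follows; channel and block counts are assumptions, not the paper's configuration.

```python
# Minimal sketch of a residual-in-residual group with short and long skip
# connections that let low-frequency information bypass the main branch.
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)            # short skip connection

class ResidualGroup(nn.Module):
    def __init__(self, channels: int = 64, n_blocks: int = 4):
        super().__init__()
        self.blocks = nn.Sequential(*[ResidualBlock(channels) for _ in range(n_blocks)])
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x):
        return x + self.conv(self.blocks(x))   # long skip around the group
```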

PMID:38535144 | DOI:10.3390/jimaging10030064

Categories: Literature Watch

Pedestrian-Accessible Infrastructure Inventory: Enabling and Assessing Zero-Shot Segmentation on Multi-Mode Geospatial Data for All Pedestrian Types

Wed, 2024-03-27 06:00

J Imaging. 2024 Feb 21;10(3):52. doi: 10.3390/jimaging10030052.

ABSTRACT

In this paper, a Segment Anything Model (SAM)-based pedestrian infrastructure segmentation workflow is designed and optimized, which is capable of efficiently processing multi-sourced geospatial data, including LiDAR data and satellite imagery data. We used an expanded definition of pedestrian infrastructure inventory, which goes beyond the traditional transportation elements to include street furniture objects that are important for accessibility but are often omitted from the traditional definition. Our contributions lie in producing the necessary knowledge to answer the following three questions. First, how can mobile LiDAR technology be leveraged to produce a comprehensive pedestrian-accessible infrastructure inventory? Second, which data representation can facilitate zero-shot segmentation of infrastructure objects with SAM? Third, how well does the SAM-based method perform on segmenting pedestrian infrastructure objects? Our proposed method is designed to efficiently create pedestrian-accessible infrastructure inventory through the zero-shot segmentation of multi-sourced geospatial datasets. Through addressing three research questions, we show how the multi-mode data should be prepared, what data representation works best for what asset features, and how SAM performs on these data representations. Our findings indicate that street-view images generated from mobile LiDAR point-cloud data, when paired with satellite imagery data, can work efficiently with SAM to create a scalable pedestrian infrastructure inventory approach with immediate benefits to GIS professionals, city managers, transportation owners, and walkers, especially those with travel-limiting disabilities, such as individuals who are blind, have low vision, or experience mobility disabilities.
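
For readers unfamiliar with SAM's zero-shot mode, the sketch below shows automatic mask generation on a street-view image rendered from LiDAR; the checkpoint and file names are placeholders, and the area-based filtering rule is an assumption rather than the authors' pipeline.

```python
# Minimal sketch of zero-shot, automatic-mask segmentation with SAM on a
# street-view image. Paths and the filtering heuristic are placeholders.
import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
mask_generator = SamAutomaticMaskGenerator(sam)

image = cv2.cvtColor(cv2.imread("street_view_from_lidar.png"), cv2.COLOR_BGR2RGB)
masks = mask_generator.generate(image)  # dicts with 'segmentation', 'area', 'bbox', ...

# Keep larger segments as candidate infrastructure objects (sidewalks, furniture).
candidates = [m for m in masks if m["area"] > 5000]
print(f"{len(candidates)} candidate infrastructure segments")
```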

PMID:38535133 | DOI:10.3390/jimaging10030052

Categories: Literature Watch

DeepLabCut-based daily behavioural and posture analysis in a Cricket

Wed, 2024-03-27 06:00

Biol Open. 2024 Mar 27:bio.060237. doi: 10.1242/bio.060237. Online ahead of print.

ABSTRACT

Circadian rhythms are indispensable intrinsic programs that regulate the daily rhythmicity of physiological processes, such as feeding and sleep. The cricket has been employed as a model organism for understanding the neural mechanisms underlying circadian rhythms in insects. However, previous studies measuring rhythm-controlled behaviours only analysed locomotive activity using seesaw-type and infrared sensor-based actometers. Meanwhile, advances in deep learning techniques have made it possible to analyse animal behaviour and posture using software that is devoid of human bias and does not require physical tagging of individual animals. Here, we present a system that can simultaneously quantify multiple behaviours in individual crickets, such as locomotor activity, feeding, and sleep-like states, over the long term, using DeepLabCut, supervised machine-learning-based software for body keypoint labelling. Our system successfully labelled the six body parts of a single cricket with a high level of confidence and produced reliable data showing the diurnal rhythms of multiple behaviours. Our system also enabled the estimation of sleep-like states by focusing on posture, instead of immobility time, which is a conventional parameter. We anticipate that this system will provide an opportunity for simultaneous and automatic prediction of cricket behaviour and posture, facilitating the study of circadian rhythms.
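
A minimal sketch of the DeepLabCut workflow underlying such a system is shown below (project creation, keypoint labelling, training, and video analysis); the project name, video paths, and body-part list are placeholders, not the authors' configuration.

```python
# Minimal sketch of a DeepLabCut keypoint-tracking workflow for long-term
# cricket videos. File names and the example body-part list are placeholders.
import deeplabcut

config = deeplabcut.create_new_project(
    "cricket-posture", "lab", ["videos/cricket_day1.mp4"], copy_videos=True
)
# Edit config.yaml to list the six tracked body parts (placeholder names,
# e.g., head, thorax, abdomen, left antenna tip, right antenna tip, mouthparts).
deeplabcut.extract_frames(config)
deeplabcut.label_frames(config)             # manual keypoint annotation GUI
deeplabcut.create_training_dataset(config)
deeplabcut.train_network(config)
deeplabcut.analyze_videos(config, ["videos/cricket_longterm.mp4"])  # per-frame keypoints
```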

PMID:38533608 | DOI:10.1242/bio.060237

Categories: Literature Watch

Editorial: Artificial intelligence solutions for decision making in robotics

Wed, 2024-03-27 06:00

Front Robot AI. 2024 Mar 12;11:1389191. doi: 10.3389/frobt.2024.1389191. eCollection 2024.

NO ABSTRACT

PMID:38533526 | PMC:PMC10964767 | DOI:10.3389/frobt.2024.1389191

Categories: Literature Watch

A survey on autonomous environmental monitoring approaches: towards unifying active sensing and reinforcement learning

Wed, 2024-03-27 06:00

Front Robot AI. 2024 Mar 12;11:1336612. doi: 10.3389/frobt.2024.1336612. eCollection 2024.

ABSTRACT

The environmental pollution caused by various sources has escalated the climate crisis, making the need to establish reliable, intelligent, and persistent environmental monitoring solutions more crucial than ever. Mobile sensing systems are a popular platform due to their cost-effectiveness and adaptability. However, in practice, operation environments demand highly intelligent and robust systems that can cope with an environment's changing dynamics. To achieve this, reinforcement learning has become a popular tool, as it facilitates the training of intelligent and robust sensing agents that can handle unknown and extreme conditions. In this paper, a framework that formulates active sensing as a reinforcement learning problem is proposed. This framework allows unification of multiple essential environmental monitoring tasks and algorithms, such as coverage, patrolling, source seeking, exploration, and search and rescue. The unified framework represents a step towards bridging the divide between theoretical advancements in reinforcement learning and real-world applications in environmental monitoring. A critical review of the literature in this field is carried out, and it is found that, despite the potential of reinforcement learning for environmental active sensing applications, there is still a lack of practical implementation and most work remains in the simulation phase. It is also noted that, despite the consensus that multi-agent systems are crucial to fully realizing the potential of active sensing, there is a lack of research in this area.
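
To make the reinforcement-learning formulation of active sensing concrete, the sketch below frames a toy source-seeking task as a Gymnasium environment; the observation model and reward are illustrative assumptions, not a model taken from the survey.

```python
# Minimal sketch of active sensing as an RL problem: an agent moves on a grid
# and is rewarded for reaching an unknown pollution source it can only sense
# through a distance-decaying concentration signal. Purely illustrative.
import numpy as np
import gymnasium as gym
from gymnasium import spaces

class SourceSeekingEnv(gym.Env):
    def __init__(self, size: int = 10):
        super().__init__()
        self.size = size
        self.action_space = spaces.Discrete(4)  # up, down, left, right
        self.observation_space = spaces.Box(0.0, 1.0, shape=(1,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.agent = self.np_random.integers(0, self.size, size=2)
        self.source = self.np_random.integers(0, self.size, size=2)
        return self._sense(), {}

    def _sense(self):
        # Observation: a concentration-like signal decaying with distance.
        d = np.linalg.norm(self.agent - self.source)
        return np.array([np.exp(-d / self.size)], dtype=np.float32)

    def step(self, action):
        moves = {0: (0, 1), 1: (0, -1), 2: (-1, 0), 3: (1, 0)}
        self.agent = np.clip(self.agent + moves[action], 0, self.size - 1)
        done = bool(np.all(self.agent == self.source))
        reward = 10.0 if done else -0.1  # encourage reaching the source quickly
        return self._sense(), reward, done, False, {}
```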

PMID:38533524 | PMC:PMC10964253 | DOI:10.3389/frobt.2024.1336612

Categories: Literature Watch

A review of mechanistic learning in mathematical oncology

Wed, 2024-03-27 06:00

Front Immunol. 2024 Mar 12;15:1363144. doi: 10.3389/fimmu.2024.1363144. eCollection 2024.

ABSTRACT

Mechanistic learning refers to the synergistic combination of mechanistic mathematical modeling and data-driven machine or deep learning. This emerging field finds increasing applications in (mathematical) oncology. This review aims to capture the current state of the field and to provide a perspective on how mechanistic learning may progress in the oncology domain. We highlight the synergistic potential of mechanistic learning and point out similarities and differences between purely data-driven and mechanistic approaches concerning model complexity, data requirements, outputs generated, and interpretability of the algorithms and their results. Four categories of mechanistic learning (sequential, parallel, extrinsic, intrinsic) are presented with specific examples. We discuss a range of techniques including physics-informed neural networks, surrogate model learning, and digital twins. Example applications address complex problems predominantly from the domain of oncology research such as longitudinal tumor response predictions or time-to-event modeling. As the field of mechanistic learning advances, we aim for this review and proposed categorization framework to foster additional collaboration between the data- and knowledge-driven modeling fields. Further collaboration will help address difficult issues in oncology such as limited data availability, requirements of model transparency, and complex input data which are embraced in a mechanistic learning framework.
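
As a concrete instance of one mechanistic-learning technique discussed in the review, the sketch below trains a physics-informed neural network on a logistic tumour-growth ODE, dV/dt = rV(1 − V/K); the growth parameters, measurements, and network size are illustrative assumptions, not data from the review.

```python
# Minimal PINN sketch: fit sparse "tumour volume" measurements while enforcing
# a logistic growth ODE as a physics residual. All values are placeholders.
import torch
import torch.nn as nn

r, K = 0.3, 1.0                                           # assumed growth rate, carrying capacity
t_data = torch.tensor([[0.0], [2.0], [5.0], [10.0]])
v_data = torch.tensor([[0.05], [0.10], [0.30], [0.75]])   # synthetic sparse measurements
t_phys = torch.linspace(0, 15, 100).reshape(-1, 1).requires_grad_(True)

net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 32), nn.Tanh(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(5000):
    optimizer.zero_grad()
    # Data loss: match the sparse measurements.
    loss_data = ((net(t_data) - v_data) ** 2).mean()
    # Physics loss: the network output must satisfy dV/dt = r*V*(1 - V/K).
    v = net(t_phys)
    dv_dt = torch.autograd.grad(v, t_phys, torch.ones_like(v), create_graph=True)[0]
    loss_phys = ((dv_dt - r * v * (1 - v / K)) ** 2).mean()
    (loss_data + loss_phys).backward()
    optimizer.step()
```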

PMID:38533513 | PMC:PMC10963621 | DOI:10.3389/fimmu.2024.1363144

Categories: Literature Watch

The automated Greulich and Pyle: a coming-of-age for segmental methods?

Wed, 2024-03-27 06:00

Front Artif Intell. 2024 Mar 12;7:1326488. doi: 10.3389/frai.2024.1326488. eCollection 2024.

ABSTRACT

The well-known Greulich and Pyle (GP) method of bone age assessment (BAA) relies on comparing a hand X-ray against templates of discrete maturity classes collected in an atlas. Automated methods have recently shown great success with BAA, especially using deep learning. In this perspective, we first review the success and limitations of various automated BAA methods. We then offer a novel hypothesis: When networks predict bone age that is not aligned with a GP reference class, it is not simply statistical error (although there is that as well); they are picking up nuances in the hand X-ray that lie "outside that class." In other words, trained networks predict distributions around classes. This raises a natural question: How can we further understand the reasons for a prediction to deviate from the nominal class age? We claim that segmental aging, that is, ratings based on characteristic bone groups can be used to qualify predictions. This so-called segmental GP method has excellent properties: It can not only help identify differential maturity in the hand but also provide a systematic way to extend the use of the current GP atlas to various other populations.

PMID:38533467 | PMC:PMC10963464 | DOI:10.3389/frai.2024.1326488

Categories: Literature Watch

Deep CANALs: a deep learning approach to refining the canalization theory of psychopathology

Wed, 2024-03-27 06:00

Neurosci Conscious. 2024 Mar 26;2024(1):niae005. doi: 10.1093/nc/niae005. eCollection 2024.

ABSTRACT

Psychedelic therapy has seen a resurgence of interest in the last decade, with promising clinical outcomes for the treatment of a variety of psychopathologies. In response to this success, several theoretical models have been proposed to account for the positive therapeutic effects of psychedelics. One of the more prominent models is "RElaxed Beliefs Under pSychedelics," which proposes that psychedelics act therapeutically by relaxing the strength of maladaptive high-level beliefs encoded in the brain. The more recent "CANAL" model of psychopathology builds on the explanatory framework of RElaxed Beliefs Under pSychedelics by proposing that canalization (the development of overly rigid belief landscapes) may be a primary factor in psychopathology. Here, we make use of learning theory in deep neural networks to develop a series of refinements to the original CANAL model. Our primary theoretical contribution is to disambiguate two separate optimization landscapes underlying belief representation in the brain and describe the unique pathologies which can arise from the canalization of each. Along each dimension, we identify pathologies of either too much or too little canalization, implying that the construct of canalization does not have a simple linear correlation with the presentation of psychopathology. In this expanded paradigm, we demonstrate the ability to make novel predictions regarding what aspects of psychopathology may be amenable to psychedelic therapy, as well as what forms of psychedelic therapy may ultimately be most beneficial for a given individual.

PMID:38533457 | PMC:PMC10965250 | DOI:10.1093/nc/niae005

Categories: Literature Watch

Errata: Erratum to "Deep Learning for Fast and Spatially Constrained Tissue Quantification From Highly Accelerated Data in Magnetic Resonance Fingerprinting"

Wed, 2024-03-27 06:00

IEEE Trans Med Imaging. 2020 Feb;39(2):543. doi: 10.1109/TMI.2019.2962345. Epub 2020 Feb 3.

ABSTRACT

[This corrects the article PMC6692257.].

PMID:38533340 | PMC:PMC10965242 | DOI:10.1109/TMI.2019.2962345

Categories: Literature Watch

Sugarcane leaf dataset: A dataset for disease detection and classification for machine learning applications

Wed, 2024-03-27 06:00

Data Brief. 2024 Feb 29;53:110268. doi: 10.1016/j.dib.2024.110268. eCollection 2024 Apr.

ABSTRACT

Sugarcane, a vital crop for the global sugar industry, is susceptible to various diseases that significantly impact its yield and quality. Accurate and timely disease detection is crucial for effective management and prevention strategies. We present the "Sugarcane Leaf Dataset" consisting of 6748 high-resolution leaf images classified into nine disease categories, a healthy leaves category, and a dried leaves category. The dataset covers diseases such as smut, yellow leaf disease, pokkah boeng, mosaic, grassy shoot, brown spot, brown rust, banded chlorosis, and sett rot. The dataset's potential for reuse is significant. The provided dataset serves as a valuable resource for researchers and practitioners interested in developing machine learning algorithms for disease detection and classification in sugarcane leaves. By leveraging this dataset, various machine learning techniques can be applied, including deep learning, feature extraction, and pattern recognition, to enhance the accuracy and efficiency of automated sugarcane disease identification systems. The open availability of this dataset encourages collaboration within the scientific community, expediting research on disease control strategies and improving sugarcane production. By leveraging the "Sugarcane Leaf Dataset," we can advance disease detection, monitoring, and management in sugarcane cultivation, leading to enhanced agricultural practices and higher crop yields.

PMID:38533124 | PMC:PMC10964057 | DOI:10.1016/j.dib.2024.110268

Categories: Literature Watch

Enhanced prediction of stock markets using a novel deep learning model PLSTM-TAL in urbanized smart cities

Wed, 2024-03-27 06:00

Heliyon. 2024 Mar 13;10(6):e27747. doi: 10.1016/j.heliyon.2024.e27747. eCollection 2024 Mar 30.

ABSTRACT

Accurate predictions of stock markets are important for investors and other stakeholders of the equity markets to formulate profitable investment strategies. The improved accuracy of a prediction model even with a slight margin can translate into considerable monetary returns. However, stock market prediction is regarded as an intricate research problem due to the noise, complexity and volatility of stock data. In recent years, deep learning models have been successful in providing robust forecasts for sequential data. We propose a novel deep learning-based hybrid classification model by combining peephole LSTM with a temporal attention layer (TAL) to accurately predict the direction of stock markets. The daily data of four world indices, from the U.S., the U.K., China and India, covering 2005 to 2022, are examined. We present a comprehensive evaluation with preliminary data analysis, feature extraction and hyperparameters' optimization for the problem of stock market prediction. TAL is introduced after the peephole LSTM to select the relevant information with respect to time and enhance the performance of the proposed model. The prediction performance of the proposed model is compared with that of the benchmark models CNN, LSTM, SVM and RF using evaluation metrics of accuracy, precision, recall, F1-score, AUC-ROC, PR-AUC and MCC. The experimental results show the superior performance of our proposed model achieving better scores than the benchmark models for most evaluation metrics and for all datasets. The accuracy of the proposed model is 96% and 88% for the U.K. and Chinese stock markets respectively, and 85% for both the U.S. and Indian markets. Hence, the stock markets of the U.K. and China are found to be more predictable than those of the U.S. and India. Significant findings of our work include that the attention layer enables the peephole LSTM to better identify the long-term dependencies and temporal patterns in stock market data. Profitable and timely trading strategies can be formulated based on our proposed prediction model.
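
A minimal sketch of an LSTM followed by a temporal attention layer, in the spirit of the proposed PLSTM-TAL, is shown below; for brevity a standard LSTM stands in for the peephole variant, and all dimensions are illustrative assumptions.

```python
# Minimal sketch: LSTM + temporal attention layer (TAL) that weights hidden
# states across time before a binary direction prediction. Illustrative only.
import torch
import torch.nn as nn

class LSTMWithTemporalAttention(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)   # one attention score per time step
        self.head = nn.Linear(hidden, 1)   # binary market direction (up / down)

    def forward(self, x):                   # x: [batch, time, features]
        h, _ = self.lstm(x)                 # h: [batch, time, hidden]
        weights = torch.softmax(self.attn(h), dim=1)
        context = (weights * h).sum(dim=1)  # attention-weighted summary over time
        return torch.sigmoid(self.head(context))

model = LSTMWithTemporalAttention(n_features=8)
print(model(torch.randn(4, 30, 8)).shape)   # torch.Size([4, 1])
```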

PMID:38533061 | PMC:PMC10963254 | DOI:10.1016/j.heliyon.2024.e27747

Categories: Literature Watch

Image detection of aortic dissection complications based on multi-scale feature fusion

Wed, 2024-03-27 06:00

Heliyon. 2024 Mar 15;10(6):e27678. doi: 10.1016/j.heliyon.2024.e27678. eCollection 2024 Mar 30.

ABSTRACT

BACKGROUND: Aortic dissection is the separation of the aortic wall into a true and a false lumen, in which blood from the aortic lumen enters the aortic media through a tear in the intima, separating the media and propagating along the long axis of the aorta.

PURPOSE: To address individual differences, complex complications, and the many small targets encountered in clinical aortic dissection detection, this paper proposes a convolutional neural network, MFF-FPN (Multi-scale Feature Fusion based Feature Pyramid Network), for detecting aortic dissection complications.

METHODS: The proposed model uses ResNet50 as the backbone for feature extraction and builds a pyramid structure to fuse low-level and high-level feature information. We add an attention mechanism to the backbone network, which establishes inter-dependencies between feature-map channels and enhances the representation quality of the CNN.

RESULTS: The proposed method achieves a mean average precision (mAP) of 99.40% in multi-object detection of aortic dissection and its complications, higher than the 96.3% of the SSD model and the 99.05% of the YOLOv7 model. It greatly improves the accuracy of small-target detection, such as cysts, making it more suitable for detecting clinical lesions.

CONCLUSIONS: The proposed deep learning model achieves feature reuse and focuses on local important information. By adding only a small number of model parameters, we are able to greatly improve the detection accuracy, which is effective in detecting small target lesions commonly found in clinical settings, and also performs well on other medical and natural datasets.
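
As an illustration of the pyramid fusion described in the METHODS above, the sketch below merges high-level semantic features into higher-resolution maps via lateral connections and top-down upsampling; the channel counts mirror a ResNet50 backbone, but the module is illustrative and not the authors' MFF-FPN.

```python
# Minimal sketch of feature-pyramid fusion: lateral 1x1 convolutions plus a
# top-down pathway that adds upsampled high-level features to lower levels.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MiniFPN(nn.Module):
    def __init__(self, in_channels=(512, 1024, 2048), out_channels=256):
        super().__init__()
        self.lateral = nn.ModuleList([nn.Conv2d(c, out_channels, 1) for c in in_channels])
        self.smooth = nn.ModuleList([nn.Conv2d(out_channels, out_channels, 3, padding=1)
                                     for _ in in_channels])

    def forward(self, c3, c4, c5):
        p5 = self.lateral[2](c5)
        p4 = self.lateral[1](c4) + F.interpolate(p5, scale_factor=2, mode="nearest")
        p3 = self.lateral[0](c3) + F.interpolate(p4, scale_factor=2, mode="nearest")
        return [s(p) for s, p in zip(self.smooth, (p3, p4, p5))]

fpn = MiniFPN()
c3, c4, c5 = torch.randn(1, 512, 64, 64), torch.randn(1, 1024, 32, 32), torch.randn(1, 2048, 16, 16)
print([tuple(f.shape) for f in fpn(c3, c4, c5)])
```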

PMID:38533058 | PMC:PMC10963251 | DOI:10.1016/j.heliyon.2024.e27678

Categories: Literature Watch

Accumulative Assessment of Upper Extremity

Tue, 2024-03-26 06:00

Phys Ther. 2024 Mar 26:pzae050. doi: 10.1093/ptj/pzae050. Online ahead of print.

ABSTRACT

OBJECTIVE: The Fugl-Meyer assessment for upper extremity (FMA-UE) is a measure for assessing upper extremity motor function in patients with stroke. However, the considerable administration time of the assessment decreases its feasibility. This study aimed to develop an accumulative assessment system of upper extremity motor function (AAS-UE) based on the FMA-UE to improve administrative efficiency while retaining sufficient psychometric properties.

METHODS: The study used secondary data from 3 previous studies with FMA-UE datasets, including 2 follow-up studies of individuals with subacute stroke and 1 test-retest study of individuals with chronic stroke. The AAS-UE adopted deep learning algorithms that use patients' prior information (i.e., the FMA-UE scores from previous assessments, the time interval between adjacent assessments, and chronicity of stroke) to select a short, personalized item set for the subsequent assessment and reproduce the full FMA-UE scores.

RESULTS: Our data included a total of 682 patients after stroke. The AAS-UE administered 10 different items for each patient. The AAS-UE demonstrated good concurrent validity (r = 0.97-0.99 with the FMA-UE), high test-retest reliability (intra-class correlation coefficient = 0.96), low random measurement error (percentage of minimal detectable change = 15.6%), good group-level responsiveness (standardized response mean = 0.65-1.07), and good individual-level responsiveness (30.5%-53.2% of patients showed significant improvement). These psychometric properties were comparable to those of the FMA-UE.

CONCLUSION: The AAS-UE uses an innovative assessment method which makes good use of patients' prior information to achieve administrative efficiency with good psychometric properties.

IMPACT: This study demonstrates a new assessment method to improve administrative efficiency while retaining psychometric properties, especially individual-level responsiveness and random measurement error, by making good use of patients' basic information and medical records.

PMID:38531775 | DOI:10.1093/ptj/pzae050

Categories: Literature Watch

Deep learning-powered enzyme efficiency boosting with evolutionary information

Tue, 2024-03-26 06:00

Sci Bull (Beijing). 2024 Mar 19:S2095-9273(24)00190-7. doi: 10.1016/j.scib.2024.03.034. Online ahead of print.

NO ABSTRACT

PMID:38531716 | DOI:10.1016/j.scib.2024.03.034

Categories: Literature Watch

Rapid discrimination and ratio quantification of mixed antibiotics in aqueous solution through integrative analysis of SERS spectra via CNN combined with NN-EN model

Tue, 2024-03-26 06:00

J Adv Res. 2024 Mar 24:S2090-1232(24)00116-4. doi: 10.1016/j.jare.2024.03.016. Online ahead of print.

ABSTRACT

INTRODUCTION: Antibiotic residues released into the natural environment through antibiotic misuse have become a severe public health and ecological problem, with serious biochemical and physiological consequences. To avoid antibiotic contamination in water, implementing universal and rapid antibiotic residue detection technology is critical to maintaining antibiotic safety in aquatic environments. Surface-enhanced Raman spectroscopy (SERS) provides a powerful tool for identifying small molecular components with high sensitivity and selectivity. However, identifying individual antibiotics from SERS spectra remains a challenge due to coexisting components in the mixture.

OBJECTIVES: In this study, an intelligent analysis model for the SERS spectrum based on a deep learning algorithm was proposed for rapid identification of the antibiotic components in the mixture and quantitative determination of the ratios of these components.

METHODS: We established a water environment system containing three antibiotic residues of ciprofloxacin, doxycycline, and levofloxacin. To facilitate qualitative and quantitative analysis of the SERS spectra antibiotic mixture datasets, we developed a computational framework integrating a convolutional neural network (CNN) and a non-negative elastic network (NN-EN) method.

RESULTS: The experimental results demonstrate that the CNN model achieves a recognition accuracy of 98.68%, and interpretability analysis with Shapley Additive exPlanations (SHAP) shows that the model specifically focuses on the characteristic peak distribution. In turn, the NN-EN model accurately quantifies each component's ratio in the mixture.

CONCLUSION: Integrating the SERS technique assisted by the CNN combined with the NN-EN model exhibits great potential for rapid identification and high-precision quantification of antibiotic residues in aquatic environments.
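
A minimal sketch of the non-negative elastic-net quantification step is shown below: a mixture SERS spectrum is expressed as a non-negative combination of pure-component reference spectra, and the fitted weights are normalized to ratios; the spectra are synthetic placeholders and the regularization settings are assumptions, not the authors' parameters.

```python
# Minimal sketch of non-negative elastic-net (NN-EN) ratio quantification of a
# three-antibiotic mixture spectrum. Reference spectra here are synthetic.
import numpy as np
from sklearn.linear_model import ElasticNet

n_wavenumbers = 1000
rng = np.random.default_rng(0)
references = rng.random((n_wavenumbers, 3))   # columns: pure-component spectra
true_ratio = np.array([0.5, 0.3, 0.2])
mixture = references @ true_ratio + 0.01 * rng.standard_normal(n_wavenumbers)

model = ElasticNet(alpha=1e-4, l1_ratio=0.5, positive=True,
                   fit_intercept=False, max_iter=10_000)
model.fit(references, mixture)
ratios = model.coef_ / model.coef_.sum()      # normalize weights to component ratios
print(ratios.round(3))                        # close to [0.5, 0.3, 0.2]
```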

PMID:38531495 | DOI:10.1016/j.jare.2024.03.016

Categories: Literature Watch

Extensive T1-weighted MRI preprocessing improves generalizability of deep brain age prediction models

Tue, 2024-03-26 06:00

Comput Biol Med. 2024 Mar 20;173:108320. doi: 10.1016/j.compbiomed.2024.108320. Online ahead of print.

ABSTRACT

Brain age is an estimate of chronological age obtained from T1-weighted magnetic resonance images (T1w MRI), representing a straightforward diagnostic biomarker of brain aging and associated diseases. While the current best accuracy of brain age predictions on T1w MRIs of healthy subjects ranges from two to three years, comparing results across studies is challenging due to differences in the datasets, T1w preprocessing pipelines, and evaluation protocols used. This paper investigates the impact of T1w image preprocessing on the performance of four deep learning brain age models from recent literature. Four preprocessing pipelines, which differed in terms of registration transform, grayscale correction, and software implementation, were evaluated. The results showed that the choice of software or preprocessing steps could significantly affect the prediction error, with a maximum increase of 0.75 years in mean absolute error (MAE) for the same model and dataset. While grayscale correction had no significant impact on MAE, using affine rather than rigid registration to brain atlas statistically significantly improved MAE. Models trained on 3D images with isotropic 1mm3 resolution exhibited less sensitivity to the T1w preprocessing variations compared to 2D models or those trained on downsampled 3D images. Our findings indicate that extensive T1w preprocessing improves MAE, especially when predicting on a new dataset. This runs counter to prevailing research literature, which suggests that models trained on minimally preprocessed T1w scans are better suited for age predictions on MRIs from unseen scanners. We demonstrate that, irrespective of the model or T1w preprocessing used during training, applying some form of offset correction is essential to enable the model's performance to generalize effectively on datasets from unseen sites, regardless of whether they have undergone the same or different T1w preprocessing as the training set.
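
A minimal sketch of the offset correction the authors recommend when deploying on a new site is given below: estimate the mean signed prediction error on a small calibration subset from that site and subtract it from subsequent predictions; the numbers are illustrative placeholders.

```python
# Minimal sketch of site-wise offset correction for brain-age predictions.
import numpy as np

def offset_correct(pred_calib, age_calib, pred_new):
    """Remove the site-specific bias estimated on a calibration subset."""
    offset = np.mean(pred_calib - age_calib)   # mean signed error at the new site
    return pred_new - offset

# Example: 20 calibration subjects with a ~2-year positive bias, 5 new subjects.
rng = np.random.default_rng(0)
pred_calib = np.array([72.1, 65.3, 58.9, 70.5, 61.2] * 4)
age_calib = pred_calib - 2.0 + rng.normal(0, 1, 20)
pred_new = np.array([66.0, 71.4, 59.8, 63.2, 68.9])
print(offset_correct(pred_calib, age_calib, pred_new))
```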

PMID:38531250 | DOI:10.1016/j.compbiomed.2024.108320

Categories: Literature Watch

Dual-TranSpeckle: Dual-pathway transformer based encoder-decoder network for medical ultrasound image despeckling

Tue, 2024-03-26 06:00

Comput Biol Med. 2024 Mar 21;173:108313. doi: 10.1016/j.compbiomed.2024.108313. Online ahead of print.

ABSTRACT

Most existing deep learning-based image denoising algorithms focus on processing the overall image features, ignoring the fine differences between the semantic and pixel features. Hence, we propose Dual-TranSpeckle (DTS), a medical ultrasound image despeckling network built on a dual-path Transformer. The DTS introduces two different paths, named "semantic path" and "pixel path," to facilitate the parallel transfer of feature information within the image. The semantic path passes a global view of the input semantic features, and the image features are passed through a Semantic Block to extract global semantic information from pixel-level features. The pixel path is employed to transmit finer-grained pixel features. Within the dual-path network framework, two essential modules, namely Dual Block and Merge Block, are designed. These leverage the Transformer architecture during the encoding and decoding stages. The Dual Block module facilitates information interaction between the semantic and pixel features by considering the interdependencies across both paths. Meanwhile, the Merge Block module enables parallel transfer of feature information by merging the dual path features, thereby facilitating the self-attention calculations for the overall feature representation. Our DTS is extensively evaluated on two public datasets and one private dataset. The DTS network demonstrates significant enhancements in quantitative evaluation results in terms of peak signal-to-noise ratio (PSNR), structural similarity (SSIM), feature similarity (FSIM), and naturalness image quality evaluator (NIQE). Furthermore, our qualitative analysis confirms that the DTS has significant improvements in despeckling performance, effectively suppressing speckle noise while preserving essential image structures.
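
As an illustration of two of the quantitative metrics cited above, the sketch below computes PSNR and SSIM with scikit-image on a clean/denoised pair; the images and the stand-in despeckler are synthetic placeholders, not the paper's data.

```python
# Minimal sketch of PSNR/SSIM evaluation for a despeckling result.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
clean = rng.random((256, 256))
speckled = clean * rng.gamma(shape=4.0, scale=0.25, size=clean.shape)  # multiplicative speckle
denoised = np.clip(0.5 * (speckled + clean), 0.0, 1.0)                 # stand-in despeckler

print("PSNR:", peak_signal_noise_ratio(clean, denoised, data_range=1.0))
print("SSIM:", structural_similarity(clean, denoised, data_range=1.0))
```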

PMID:38531247 | DOI:10.1016/j.compbiomed.2024.108313

Categories: Literature Watch
