Systems Biology

Pain Trajectories in Pediatric Inflammatory Bowel Disease: Disease Severity, Optimism, and Pain Self-efficacy

Fri, 2025-03-07 06:00

Clin J Pain. 2025 Mar 7. doi: 10.1097/AJP.0000000000001279. Online ahead of print.

ABSTRACT

OBJECTIVES: This study aimed to characterize pain intensity (average, worst) and disease severity in youth with inflammatory bowel disease in the 12 months post-diagnosis, and to examine the relation between pain and risk (disease severity) and resilience (optimism, pain self-efficacy) factors over time.

METHODS: Data collection ran from February 2019 to March 2022. Newly diagnosed youth aged 8-17 with IBD completed numerical rating scales for average and worst pain intensity, Youth Life Orientation Test for optimism, and Pain Self-Efficacy Scale for pain self-efficacy via REDCap; weighted Pediatric Crohn's Disease Activity Index and the Pediatric Ulcerative Colitis Activity Index were used as indicators of disease severity. Descriptive statistics characterized pain and disease severity. Multilevel modeling explored relations between variables over time, including moderation effects of optimism and pain self-efficacy.

RESULTS: At baseline, 83 youth (Mage=13.9, SD=2.6; 60.2% Crohn's disease; 39.8% female) were included. Attrition rates at 4 and 12 months were 6.0% and 9.6%, respectively. Across time, at least 52% of participants reported pain. The proportion of participants in disease remission increased from 4% to 70% over 12 months. Higher disease severity predicted higher worst pain, regardless of time since diagnosis. Higher pain self-efficacy: (a) predicted lower average and worst pain, especially at later time points; and (b) attenuated the association between disease severity and worst pain when included as a moderator. Higher optimism predicted lower worst pain.

DISCUSSION: Pain is prevalent in pediatric inflammatory bowel disease and impacted by disease severity, pain self-efficacy, and optimism. Findings highlight modifiable intervention targets.

PMID:40052200 | DOI:10.1097/AJP.0000000000001279

Categories: Literature Watch

Regulation of gene expression through protein-metabolite interactions

Fri, 2025-03-07 06:00

NPJ Metab Health Dis. 2025;3(1):7. doi: 10.1038/s44324-024-00047-w. Epub 2025 Mar 4.

ABSTRACT

Organisms have to adapt to changes in their environment. Cellular adaptation requires sensing, signalling and ultimately the activation of cellular programs. Metabolites are environmental signals that are sensed by proteins, such as metabolic enzymes, protein kinases and nuclear receptors. Recent studies have discovered novel metabolite sensors that function as gene regulatory proteins such as chromatin associated factors or RNA binding proteins. Due to their function in regulating gene expression, metabolite-induced allosteric control of these proteins facilitates a crosstalk between metabolism and gene expression. Here we discuss the direct control of gene regulatory processes by metabolites and recent progress that expands our ability to systematically characterize metabolite-protein interaction networks. Obtaining a detailed map of such networks is of great interest for aiding metabolic disease treatment and drug target identification.

PMID:40052108 | PMC:PMC11879850 | DOI:10.1038/s44324-024-00047-w

Categories: Literature Watch

Corrigendum: Integrated network analysis to identify key modules and potential hub genes involved in bovine respiratory disease: a systems biology approach

Fri, 2025-03-07 06:00

Front Genet. 2025 Feb 20;16:1572285. doi: 10.3389/fgene.2025.1572285. eCollection 2025.

ABSTRACT

[This corrects the article DOI: 10.3389/fgene.2021.753839.].

PMID:40051703 | PMC:PMC11882524 | DOI:10.3389/fgene.2025.1572285

Categories: Literature Watch

Thermodynamic modeling of RsmA - mRNA interactions capture novel direct binding across the Pseudomonas aeruginosa transcriptome

Fri, 2025-03-07 06:00

Front Mol Biosci. 2025 Feb 20;12:1493891. doi: 10.3389/fmolb.2025.1493891. eCollection 2025.

ABSTRACT

Pseudomonas aeruginosa (PA) is a ubiquitous Gram-negative bacterium that owes its survivability to numerous sensing and signaling pathways, which confer fitness through speed of response. Post-transcriptional regulation is an energy-efficient approach to quickly shift gene expression in response to the environment. The conserved post-transcriptional regulator RsmA is involved in regulating translation of genes involved in pathways that contribute to virulence, metabolism, and antibiotic resistance. Prior high-throughput approaches to map the full regulatory landscape of RsmA have estimated a target pool of approximately 500 genes; however, these approaches have been limited to a narrow range of growth phase, strain, and media conditions. Computational modeling presents a condition-independent approach to generating predictions for binding between the RsmA protein and highest affinity mRNAs. In this study, we improve upon a two-state thermodynamic model to predict the likelihood of RsmA binding to the 5' UTR sequence of genes present in the PA genome. Our modeling approach predicts 1043 direct RsmA-mRNA binding interactions, including 457 novel mRNA targets. We then perform GO term enrichment tests on our predictions that reveal significant enrichment for DNA binding transcriptional regulators. In addition, quorum sensing, biofilm formation, and two-component signaling pathways were represented in KEGG enrichment analysis. We confirm binding predictions using in vitro binding assays, and regulatory effects using in vivo translational reporters. These reveal RsmA binding and regulation of a broader number of genes not previously reported. An important new observation of this work is the direct regulation of several novel mRNA targets encoding for factors involved in Quorum Sensing and the Type IV Secretion system, such as rsaL and mvaT.
Our study demonstrates the utility of thermodynamic modeling for predicting interactions independent of complex and environmentally-sensitive systems, specifically for profiling the post-transcriptional regulator RsmA. Our experimental validation of RsmA binding to novel targets both supports our model and expands upon the pool of characterized target genes in PA. Overall, our findings demonstrate that a modeling approach can differentiate direct from indirect binding interactions and predict specific sites of binding for this global regulatory protein, thus broadening our understanding of the role of RsmA regulation in this relevant pathogen.
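To make the idea of a two-state thermodynamic model concrete, the sketch below shows the generic textbook form: an mRNA site is either bound or unbound, and the bound fraction follows from a binding free energy and the protein concentration. This is not the model used in the study. The function name, the 1 M standard state, the default temperature, and the example free energies and concentration are all illustrative assumptions.

```python
from math import exp

# Gas constant in kcal/(mol*K); free energies below are in kcal/mol.
R_KCAL = 0.0019872


def binding_probability(delta_g_kcal, protein_conc_molar, temperature_k=310.15):
    """Bound fraction for a generic two-state (bound/unbound) site.

    Assumes a 1 M standard state, so Kd = exp(dG / RT) in molar units.
    More negative delta_g_kcal means tighter binding.
    """
    if protein_conc_molar < 0:
        raise ValueError("protein concentration must be non-negative")
    kd = exp(delta_g_kcal / (R_KCAL * temperature_k))
    return protein_conc_molar / (protein_conc_molar + kd)


# Hypothetical free energies for two 5' UTR sites at 1 micromolar protein.
p_strong = binding_probability(-12.0, 1e-6)
p_weak = binding_probability(-8.0, 1e-6)
# When the concentration equals Kd, the site is half occupied.
p_half = binding_probability(0.0, 1.0)
```

In a sketch like this, ranking candidate sites by the computed bound fraction is what would turn free-energy estimates into a list of predicted targets; how the free energies themselves are obtained is outside its scope.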

PMID:40051501 | PMC:PMC11882435 | DOI:10.3389/fmolb.2025.1493891

Categories: Literature Watch

Mobilome-Mediated Speciation: Genomic Insights Into Horizontal Gene Transfer in Methanosarcina

Fri, 2025-03-07 06:00

J Basic Microbiol. 2025 Mar 6:e70013. doi: 10.1002/jobm.70013. Online ahead of print.

ABSTRACT

Speciation in prokaryotes is often driven by complex genetic exchanges such as horizontal gene transfer (HGT), which facilitates genomic divergence and adaptation. In this study, we inferred the evolutionary transitions of the mobilome (plasmids, transposons, and phages) between Methanosarcina and bacteria in driving speciation within the Methanosarcina genus. By conducting evolutionary and phylogenetic analyses of Methanosarcina acetivorans, M. barkeri, M. mazei, and M. siciliae, we identified key mobilome elements acquired through HGT from distantly related bacterial species. These mobile genetic elements have shaped genomic plasticity, enabling Methanosarcina to adapt to diverse environmental niches and potentially facilitating lineage divergence. The acquisition of mobilome-associated genes involved in antibiotic resistance, DNA repair, and stress responses suggests their significant role in the ecological speciation of Methanosarcina. Overall, we hypothesized that their mobile genetic element might have been acquired from distantly related bacteria by HGT and subsequently established as new functional homologs in the present lineage. This study provides insight into how mobilome-mediated gene flow contributes to genomic divergence and speciation within microbial populations, highlighting the broader significance of mobilome in microbial evolution and speciation processes.

PMID:40051073 | DOI:10.1002/jobm.70013

Categories: Literature Watch

Convergent Evolution of Coenzyme Metabolism in Methanosarcina mazei: Insights Into Primitive Life and Metabolic Adaptations

Fri, 2025-03-07 06:00

J Basic Microbiol. 2025 Mar 6:e70015. doi: 10.1002/jobm.70015. Online ahead of print.

ABSTRACT

The convergent evolution of coenzyme metabolism in methanogens provides critical insights into primitive life and metabolic adaptations. This study investigated the molecular evolution and functional dynamics of eight coenzymes and cofactors in Methanosarcina mazei, a model methanogen essential for methane production and energy conservation in anaerobic environments. Phylogenetic and genetic diversity analyses of the 706 protein sequences revealed conserved evolutionary trajectories interspersed with lineage-specific adaptations driven by gene duplication, horizontal gene transfer, and selective pressures. Key findings included the purifying selection of methanofuran (Tajima's D = -2.9589) and coenzyme A (Tajima's D = -2.8555), indicating the conservation of critical metabolic functions. The coenzyme B biosynthesis pathway showed balancing selection (Tajima's D = 2.38602), reflecting its evolutionary plasticity. Phylogenetic analyses linked coenzyme F420 biosynthetic enzymes closely to Methanosarcina horonobensis, while coenzyme F430 enzymes highlighted prokaryotic specialization distinct from their eukaryotic counterparts. Coenzyme M biosynthetic genes have demonstrated unique evolutionary connections with species across domains, such as Methanothermobacter thermautotrophicus and Gekko japonicus, emphasizing their broad adaptive significance. These evolutionary trajectories reveal how M. mazei optimized its metabolic pathways to thrive in extreme anaerobic environments, bridging ancient metabolic systems from the Last Universal Common Ancestor with contemporary ecological adaptations.
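The Tajima's D values quoted above compare two estimates of diversity: the mean number of pairwise differences and the number of segregating sites scaled by sample size. The sketch below implements the standard Tajima (1989) statistic for a small aligned sample. The function name and the toy four-sequence alignments are illustrative assumptions and are unrelated to the study's data or software.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for equal-length aligned sequences (needs n >= 4)."""
    n = len(sequences)
    if n < 4:
        raise ValueError("the variance term vanishes for fewer than 4 sequences")
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # S: number of alignment columns with more than one state.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of differences over all sequence pairs.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    return (pi - seg_sites / a1) / sqrt(e1 * seg_sites + e2 * seg_sites * (seg_sites - 1))


# Toy alignment where every variant is a singleton (excess of rare variants).
d_singletons = tajimas_d(["TAAA", "ATAA", "AATA", "AAAA"])
# Toy alignment where every variant is at intermediate frequency.
d_balanced = tajimas_d(["TTTA", "TTTA", "AAAA", "AAAA"])
```

An excess of rare variants gives a negative D and an excess of intermediate-frequency variants gives a positive D. Negative values are also produced by demographic events such as population expansion, so the sign alone does not identify the type of selection.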

PMID:40051064 | DOI:10.1002/jobm.70015

Categories: Literature Watch

Developing resistance to Fusarium wilt in chickpea: From identifying meta-QTLs to molecular breeding

Thu, 2025-03-06 06:00

Plant Genome. 2025 Mar;18(1):e70004. doi: 10.1002/tpg2.70004.

ABSTRACT

Fusarium wilt (FW) significantly affects the growth and development of chickpea (Cicer arietinum L.), leading to substantial economic losses. FW resistance is a quantitative trait that is controlled by multiple genomic regions. In this study, a meta-analysis was conducted on 32 quantitative trait loci (QTLs) associated with FW resistance, leading to the identification of seven meta-QTL (MQTL) regions distributed across CaLG2, CaLG4, CaLG5, and CaLG6 of the chickpea linkage groups. The integrated analysis revealed several candidate genes potentially important for FW resistance, including genes associated with sensing (e.g., LRR-RLK), signaling (e.g., mitogen-activated protein kinase [MAPK1]), and transcription regulation (e.g., NAC, WRKY, and bZIP). Subsequently, a marker-assisted backcrossing (MABC) trial was executed leveraging the MQTL outcomes to introgress FW resistance from an FW-resistant chickpea cultivar (Ana) into a superior high-yielding Kabuli cultivar (Hashem). The breeding process was extended over 5 years (2018-2023) and resulted in the development of BC3F2 genotypes. Consequently, 12 genotypes carrying homozygous resistance alleles were chosen, with three genotypes showing genetic backgrounds matching 90%-96% of the recurrent parent. The findings of this study have significant implications for upcoming programs, encompassing fine-mapping, marker-assisted breeding, and genetic engineering, consequently contributing to the effective control of FW and the improved production of chickpea.
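The reported 90%-96% recurrent-parent background can be compared with the textbook expectation for backcrossing without background selection, where each backcross halves the remaining donor genome on average. The sketch below computes that expectation; the function name is illustrative, and the formula describes an average over unlinked loci rather than the outcome for any particular genotype in the study.

```python
def expected_recurrent_parent_fraction(backcrosses: int) -> float:
    """Expected recurrent-parent genome fraction after n backcrosses.

    Uses the standard expectation 1 - (1/2)**(n + 1), counting the F1 as
    carrying half of each parent's genome. Subsequent selfing (e.g. BC3F2)
    does not change this average in the absence of selection.
    """
    if backcrosses < 0:
        raise ValueError("number of backcrosses must be non-negative")
    return 1.0 - 0.5 ** (backcrosses + 1)


# Expected fractions for BC1, BC2 and BC3: 0.75, 0.875, 0.9375.
expected_by_generation = {n: expected_recurrent_parent_fraction(n) for n in (1, 2, 3)}
```

For BC3 the expectation is 93.75%, which sits inside the 90%-96% range reported for the three selected genotypes.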

PMID:40050693 | DOI:10.1002/tpg2.70004

Categories: Literature Watch

The integrated stress response pathway controls cytokine production in tissue-resident memory CD4+ T cells

Thu, 2025-03-06 06:00

Nat Immunol. 2025 Mar 6. doi: 10.1038/s41590-025-02105-x. Online ahead of print.

ABSTRACT

Tissue-resident memory T (TRM) cells are a specialized T cell population that reside in tissues and provide a rapid protective response upon activation. Here, we showed that human and mouse CD4+ TRM cells existed in a poised state and stored messenger RNAs encoding proinflammatory cytokines without protein production. At steady state, cytokine mRNA translation in TRM cells was suppressed by the integrated stress response (ISR) pathway. Upon activation, the central ISR regulator, eIF2α, was dephosphorylated and stored cytokine mRNA was translated for immediate cytokine production. Genetic or pharmacological activation of the ISR-eIF2α pathway reduced cytokine production and ameliorated autoimmune kidney disease in mice. Consistent with these results, the ISR pathway in CD4+ TRM cells was downregulated in patients with immune-mediated diseases of the kidney and the intestine compared to healthy controls. Our results indicated that stored cytokine mRNA and translational regulation in CD4+ TRM cells facilitate rapid cytokine production during local immune response.

PMID:40050432 | DOI:10.1038/s41590-025-02105-x

Categories: Literature Watch

DrBioRight 2.0: an LLM-powered bioinformatics chatbot for large-scale cancer functional proteomics analysis

Thu, 2025-03-06 06:00

Nat Commun. 2025 Mar 6;16(1):2256. doi: 10.1038/s41467-025-57430-4.

ABSTRACT

Functional proteomics provides critical insights into cancer mechanisms, facilitating the discovery of novel biomarkers and therapeutic targets. We have developed a comprehensive cancer functional proteomics resource using reverse phase protein arrays, incorporating data from nearly 8000 patient samples from The Cancer Genome Atlas and approximately 900 samples from the Cancer Cell Line Encyclopedia. Our dataset includes a curated panel of nearly 500 high-quality antibodies, covering all major cancer hallmark pathways. To enhance the accessibility and analytic power of this resource, we introduce DrBioRight 2.0 (https://drbioright.org), an intuitive bioinformatic platform powered by state-of-the-art large language models. DrBioRight enables researchers to explore protein-centric cancer omics data, perform advanced analyses, visualize results, and engage in interactive discussions using natural language. By streamlining complex proteogenomic analyses, this tool accelerates the translation of large-scale functional proteomics data into meaningful biomedical insights.

PMID:40050282 | DOI:10.1038/s41467-025-57430-4

Categories: Literature Watch

The Halo of Future Bio-industry based on Engineering Halomonas

Thu, 2025-03-06 06:00

Metab Eng. 2025 Mar 4:S1096-7176(25)00031-X. doi: 10.1016/j.ymben.2025.03.001. Online ahead of print.

ABSTRACT

The utilization of microorganisms to transform biomass into biofuels and biochemicals presents a viable and competitive alternative to conventional petroleum refining processes. Halomonas species are salt-tolerant and alkaliphilic, with beneficial properties that make them contamination-resistant platforms for industrial biotechnology, facilitating the commercial-scale production of valuable bioproducts. Here we summarized the metabolic and genomic engineering approaches, as well as the biochemical products synthesized by Halomonas. Methods were presented for expanding substrate utilization in Halomonas to enhance its capabilities as a robust workhorse for bioproducts. In addition, we briefly reviewed the Next Generation Industrial Biotechnology (NGIB) based on Halomonas for open and continuous fermentation. In particular, we discussed industrial efforts based on Halomonas chassis, along with the rising prospects and essential strategies to enable the successful development of Halomonas as microbial NGIB manufacturing platforms.

PMID:40049362 | DOI:10.1016/j.ymben.2025.03.001

Categories: Literature Watch

Investigation of southern Thailand sweet pickled mango metabolic profiles related to deterioration

Thu, 2025-03-06 06:00

Food Chem. 2025 Mar 1;478:143663. doi: 10.1016/j.foodchem.2025.143663. Online ahead of print.

ABSTRACT

Southern Thailand sweet pickled mango (MBC) is a famous delicacy and economically important for the local communities. This study aimed to elucidate important metabolites related to MBC deterioration at 4 °C (STR4) and 30 °C (STR30). The results show that deterioration of MBCs was linked to increased levels of ethyl acetate, isopropyl alcohol, trans-β-ocimene, isopentyl acetate, 2-phenethyl acetate, glucose, and fructose, along with a decrease in sucrose. Moreover, isopentyl acetate, ethyl acetate, and 2-phenethyl acetate were significantly higher in STR4 compared to STR30, with log2[fold change (FC)] values of 3.2, 2.0, and 1.0, respectively. Meanwhile, STR4 had a lower sucrose level (log [FC] -1.4) than STR30. It was postulated that a longer storage time of STR4 than STR30 affects sucrose hydrolysis. Due to the abundance of volatile metabolites in deteriorated MBC, applying odor/flavor absorber film on MBC packaging might help prolong its shelf life.

PMID:40049138 | DOI:10.1016/j.foodchem.2025.143663

Categories: Literature Watch

Reduced function of the adaptor SH2B3 promotes T1D via altered cytokine-regulated, T cell intrinsic immune tolerance

Thu, 2025-03-06 06:00

Diabetes. 2025 Mar 6:db240655. doi: 10.2337/db24-0655. Online ahead of print.

ABSTRACT

Genome-wide association studies have identified SH2B3 as an important non-MHC gene for islet autoimmunity and type 1 diabetes (T1D). In this study, we found a single SH2B3 haplotype significantly associated with increased risk for human T1D. Fine mapping has demonstrated the most credible causative variant is the single nucleotide rs3184504*T polymorphism in SH2B3. To better characterize the role of SH2B3 in T1D, we used mouse modeling and found a T cell-intrinsic role for SH2B3 regulating peripheral tolerance. SH2B3 deficiency had minimal effect on TCR signaling or proliferation across antigen doses, yet enhanced cell survival and cytokine signaling including common gamma chain-dependent and interferon-gamma receptor signaling. SH2B3 deficient naïve CD8+ T cells showed augmented STAT5-MYC and effector-related gene expression partially reversed with blocking autocrine IL-2 in culture. Using the RIP-mOVA model, we found CD8+ T cells lacking SH2B3 promoted early islet destruction and diabetes without requiring CD4+ T cell help. SH2B3-deficient cells demonstrated increased survival and reduced activation-induced cell death. Lastly, we created a spontaneous NOD.Sh2b3-/- mouse model and found markedly increased incidence and accelerated T1D across sexes. Collectively, these studies identify SH2B3 as a critical mediator of peripheral T cell tolerance limiting the T cell response to self-antigens.

PMID:40048557 | DOI:10.2337/db24-0655

Categories: Literature Watch

A subcellular map of translational machinery composition and regulation at the single-molecule level

Thu, 2025-03-06 06:00

Science. 2025 Mar 7;387(6738):eadn2623. doi: 10.1126/science.adn2623. Epub 2025 Mar 7.

ABSTRACT

Millions of ribosomes are packed within mammalian cells, yet we lack tools to visualize them in toto and characterize their subcellular composition. In this study, we present ribosome expansion microscopy (RiboExM) to visualize individual ribosomes and an optogenetic proximity-labeling technique (ALIBi) to probe their composition. We generated a super-resolution ribosomal map, revealing subcellular translational hotspots and enrichment of 60S subunits near polysomes at the endoplasmic reticulum (ER). We found that Lsg1 tethers 60S to the ER and regulates translation of select proteins. Additionally, we discovered ribosome heterogeneity at mitochondria guiding translation of metabolism-related transcripts. Lastly, we visualized ribosomes in neurons, revealing a dynamic switch between monosomes and polysomes in neuronal translation. Together, these approaches enable exploration of ribosomal localization and composition at unprecedented resolution.

PMID:40048539 | DOI:10.1126/science.adn2623

Categories: Literature Watch

Systematic identification of allosteric effectors in Escherichia coli metabolism

Thu, 2025-03-06 06:00

Proc Natl Acad Sci U S A. 2025 Mar 11;122(10):e2423767122. doi: 10.1073/pnas.2423767122. Epub 2025 Mar 6.

ABSTRACT

Recent physical binding screens suggest that protein-metabolite interactions are more extensive than previously recognized. To elucidate the functional relevance of these interactions, we developed a mass spectrometry-based screening method for higher throughput in vitro enzyme assays. By systematically quantifying the effects of 79 metabolites on the activity of 20 central Escherichia coli enzymes, we not only assess functional relevance but also gauge the depth of the current understanding of regulatory interactions within one of the best-characterized networks. Our identification of 50 inhibitors and 14 activators not only expands the range of known input signals but also uncovers novel regulatory logic. For instance, we observed that AMP inhibits malic enzyme to safeguard the cyclic operation of the tricarboxylic acid cycle, and erythrose-4-phosphate inhibits 6-phosphogluconate dehydrogenase to redirect flux from the pentose phosphate pathway into the Entner-Doudoroff pathway. Discrepancies between our standardized assays and existing database entries suggest that many previously reported interactions might occur only under specific, often nonphysiological conditions. Our dataset represents a systematically determined functional protein-metabolite interaction network, establishing a baseline for allosteric regulation in central metabolism. These results enhance our understanding of the regulatory logic governing metabolic processes and underscore its significance in cellular adaptation and growth.

PMID:40048276 | DOI:10.1073/pnas.2423767122

Categories: Literature Watch

Synthetic Genetic Elements Enable Rapid Characterization of Inorganic Carbon Uptake Systems in Cupriavidus necator H16

Thu, 2025-03-06 06:00

ACS Synth Biol. 2025 Mar 6. doi: 10.1021/acssynbio.4c00869. Online ahead of print.

ABSTRACT

Cupriavidus necator H16 is a facultative chemolithotroph capable of using CO2 as a carbon source, making it a promising organism for carbon-negative biomanufacturing of petroleum-based product alternatives. In contrast to model microbes, genetic engineering technologies are limited in C. necator, constraining its utility in basic and applied research. Here, we developed a genome engineering technology to efficiently mobilize, integrate, and express synthetic genetic elements (SGEs) in C. necator. We tested the chromosomal expression of four inducible promoters to optimize an engineered genetic landing pad for tunable gene expression. To demonstrate utility, we employed the SGE system to design, mobilize, and express eight heterologous inorganic carbon uptake pathways in C. necator. We demonstrated that all inorganic carbon uptake systems upregulated intracellular bicarbonate concentrations under heterotrophic conditions. This work establishes the utility of the SGE strategy for expedited integration and tunable expression of heterologous pathways, and enhances intracellular bicarbonate concentrations in C. necator.

PMID:40048245 | DOI:10.1021/acssynbio.4c00869

Categories: Literature Watch

Correction: Metabolic response of Klebsiella oxytoca to ciprofloxacin exposure: a metabolomics approach

Thu, 2025-03-06 06:00

Metabolomics. 2025 Mar 6;21(2):38. doi: 10.1007/s11306-025-02234-2.

NO ABSTRACT

PMID:40048010 | DOI:10.1007/s11306-025-02234-2

Categories: Literature Watch

Evaluation of information flows in the RAS-MAPK system using transfer entropy measurements

Thu, 2025-03-06 06:00

Elife. 2025 Mar 6;14:e104432. doi: 10.7554/eLife.104432.

ABSTRACT

The RAS-MAPK system plays an important role in regulating various cellular processes, including growth, differentiation, apoptosis, and transformation. Dysregulation of this system has been implicated in genetic diseases and cancers affecting diverse tissues. To better understand the regulation of this system, we employed information flow analysis based on transfer entropy (TE) between the activation dynamics of two key elements in cells stimulated with EGF: SOS, a guanine nucleotide exchanger for the small GTPase RAS, and RAF, a RAS effector serine/threonine kinase. TE analysis allows for model-free assessment of the timing, direction, and strength of the information flow regulating the system response. We detected significant amounts of TE in both directions between SOS and RAF, indicating feedback regulation. Importantly, the amount of TE did not simply follow the input dose or the intensity of the causal reaction, demonstrating the uniqueness of TE. TE analysis proposed regulatory networks containing multiple tracks and feedback loops and revealed temporal switching in the reaction pathway primarily responsible for reaction control. This proposal was confirmed by the effects of an MEK inhibitor on TE. Furthermore, TE analysis identified the functional disorder of a SOS mutation associated with Noonan syndrome, a human genetic disease whose pathogenic mechanism is not yet precisely known. TE assessment holds significant promise as a model-free analysis method of reaction networks in molecular pharmacology and pathology.
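As a rough illustration of what transfer entropy measures, the sketch below estimates TE between two discrete time series using a history length of one step and simple frequency counts. It is a minimal plug-in estimator on synthetic binary data, not the estimator applied to the SOS and RAF activation dynamics in the study; the function name, the one-step history, and the toy series are assumptions.

```python
import random
from collections import Counter
from math import log2


def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1.

    TE = sum p(y_next, y_now, x_now)
             * log2( p(y_next | y_now, x_now) / p(y_next | y_now) )
    """
    if len(source) != len(target) or len(target) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(target) - 1
    triples, pair_yx, pair_yy, single_y = Counter(), Counter(), Counter(), Counter()
    for t in range(n):
        y_next, y_now, x_now = target[t + 1], target[t], source[t]
        triples[(y_next, y_now, x_now)] += 1
        pair_yx[(y_now, x_now)] += 1
        pair_yy[(y_next, y_now)] += 1
        single_y[y_now] += 1

    te = 0.0
    for (y_next, y_now, x_now), count in triples.items():
        p_joint = count / n
        p_given_both = count / pair_yx[(y_now, x_now)]
        p_given_self = pair_yy[(y_next, y_now)] / single_y[y_now]
        te += p_joint * log2(p_given_both / p_given_self)
    return te


# Synthetic example: y copies x with a one-step delay, so information
# flows from x to y but not from y to x.
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(2000)]
y = [0] + x[:-1]
te_x_to_y = transfer_entropy(x, y)
te_y_to_x = transfer_entropy(y, x)
```

The asymmetry between the two directions is the property that lets TE indicate the direction of information flow. Continuous signals would first need discretization or a different estimator, and finite samples add a small positive bias to the plug-in estimate.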

PMID:40047537 | DOI:10.7554/eLife.104432

Categories: Literature Watch

Evaluation of antiobesogenic properties of fermented foods: In silico insights

Thu, 2025-03-06 06:00

J Food Sci. 2025 Mar;90(3):e70074. doi: 10.1111/1750-3841.70074.

ABSTRACT

Obesity prevalence has steadily increased over the past decades. Standard approaches, such as increased energy expenditure, lifestyle changes, a balanced diet, and the use of specific drugs, are the conventional strategies for preventing or treating the disease and its associated complications. Fermented foods and their subsequent bioactive constituents are now believed to be a novel strategy that can complement already existing approaches for managing and preventing this disease. Recent developments in systems biology and bioinformatics have made it possible to model and simulate compounds and disease interactions. The adoption of such in silico models has contributed to the discovery of novel fermented product targets and helped in testing hypotheses regarding the mechanistic impact and underlying functions of fermented food components. From the studies explored, key findings suggest that fermented foods affect adipogenesis, lipid metabolism, appetite regulation, gut microbiota composition, insulin resistance, and inflammation related to obesity, which could lead to new ways to treat these conditions. These outcomes were linked to probiotics, prebiotics, metabolites, and complex bioactive substances produced during fermentation. Overall, fermented foods and their bioactive compounds show promise as innovative tools for obesity management by influencing metabolic pathways and overall gut health.

PMID:40047326 | DOI:10.1111/1750-3841.70074

Categories: Literature Watch

Structuring data analysis projects in the Open Science era with Kerblam!

Thu, 2025-03-06 06:00

F1000Res. 2025 Jan 15;14:88. doi: 10.12688/f1000research.157325.1. eCollection 2025.

ABSTRACT

BACKGROUND: Structuring data analysis projects, that is, defining the layout of files and folders needed to analyze data using existing tools and novel code, largely follows personal preferences. Open Science calls for more accessible, transparent and understandable research. We believe that Open Science principles can be applied to the way data analysis projects are structured.

METHODS: We examine the structure of several data analysis project templates by analyzing project template repositories present in GitHub. Through visualization of the resulting consensus structure, we draw observations regarding how the ecosystem of project structures is shaped, and what salient characteristics it has.

RESULTS: Project templates show little overlap, but many distinct practices can be highlighted. We take them into account with the wider Open Science philosophy to draw a few fundamental Design Principles to guide researchers when designing a project space. We present Kerblam!, a project management tool that can work with such a project structure to expedite data handling, execute workflow managers, and share the resulting workflow and analysis outputs with others.

CONCLUSIONS: We hope that, by following these principles and using Kerblam!, the landscape of data analysis projects can become more transparent, understandable, and ultimately useful to the wider community.

PMID:40047014 | PMC:PMC11880754 | DOI:10.12688/f1000research.157325.1

Categories: Literature Watch
