NIH Extramural Nexus News
Cell Type-Specific Transcriptional Control of Gsk3β in the Developing Mammalian Neocortex
Front Neurosci. 2022 Mar 23;16:811689. doi: 10.3389/fnins.2022.811689. eCollection 2022.
ABSTRACT
Temporal control of neurogenesis is central for the development and evolution of species-specific brain architectures. The balance between progenitor expansion and neuronal differentiation is tightly coordinated by cell-intrinsic and cell-extrinsic cues. Wnt signaling plays pivotal roles in the proliferation and differentiation of neural progenitors in a temporal manner. However, regulatory mechanisms that adjust intracellular signaling amplitudes according to cell fate progression remain to be elucidated. Here, we report the transcriptional controls of Gsk3β, a critical regulator of Wnt signaling, in the developing mouse neocortex. Gsk3β expression was higher in ventricular neural progenitors, while it gradually declined in differentiated neurons. We identified an active cis-regulatory module (CRM) of Gsk3β that responded to cell type-specific transcription factors, such as Sox2, Sox9, and Neurogenin2. Furthermore, we found extensive conservation of the CRM among mammals but not in non-mammalian amniotes. Our data suggest that a mammalian-specific CRM drives the cell type-specific activity of Gsk3β to fine-tune Wnt signaling, which contributes to the tight control of neurogenesis during neocortical development.
PMID:35401100 | PMC:PMC8983961 | DOI:10.3389/fnins.2022.811689
Identification and Analysis of Transcriptional Regulatory Networks of Osteosarcoma Microarray Data via Systems Biology
J Oleo Sci. 2022;71(3):379-386. doi: 10.5650/jos.ess21327.
ABSTRACT
Osteosarcoma (OS) is a relatively uncommon tumor that is defined histologically by malignant cells developing osteoid. Osteosarcomas are mesenchymal cell tumors that cause abnormal bone growth. A combination of genetic, epigenetic, and environmental factors leads mesenchymal stem cells to develop into bone precursor cells, resulting in osteosarcoma. Only tumor suppressor genes, such as p53, Rb, RECQL4, BLM, and WRN, have been detected in inherited family illnesses with an OS susceptibility. These genes, in particular, play an essential role in the development of OS in individuals. In this research, core genes responsible for OS were determined using a microarray and systems biology. A total of 234 overexpressed or down-regulated genes were identified, of which 60 were considered key genes, many with known roles in bone growth. Transcriptional regulatory networks were developed with this data and subsequently partitioned to define cis-regulatory modules. Results indicate that several OS-specific genes have strongly conserved the clustering of bone-related cis-regulatory modules, thus promoting the hypothesis that a bone-related gene network is essential for understanding OS biology and may play a role in bone contractility and anomalies.
PMID:35236797 | DOI:10.5650/jos.ess21327
Uncovering the mesendoderm gene regulatory network through multi-omic data integration
Cell Rep. 2022 Feb 15;38(7):110364. doi: 10.1016/j.celrep.2022.110364.
ABSTRACT
Mesendodermal specification is one of the earliest events in embryogenesis, where cells first acquire distinct identities. Cell differentiation is a highly regulated process that involves the function of numerous transcription factors (TFs) and signaling molecules, which can be described with gene regulatory networks (GRNs). Cell differentiation GRNs are difficult to build because existing mechanistic methods are low throughput, and high-throughput methods tend to be non-mechanistic. Additionally, integrating highly dimensional data composed of more than two data types is challenging. Here, we use linked self-organizing maps to combine chromatin immunoprecipitation sequencing (ChIP-seq)/ATAC-seq with temporal, spatial, and perturbation RNA sequencing (RNA-seq) data from Xenopus tropicalis mesendoderm development to build a high-resolution genome scale mechanistic GRN. We recover both known and previously unsuspected TF-DNA/TF-TF interactions validated through reporter assays. Our analysis provides insights into transcriptional regulation of early cell fate decisions and provides a general approach to building GRNs using highly dimensional multi-omic datasets.
PMID:35172134 | DOI:10.1016/j.celrep.2022.110364
Common Themes and Future Challenges in Understanding Gene Regulatory Network Evolution
Cells. 2022 Feb 1;11(3):510. doi: 10.3390/cells11030510.
ABSTRACT
A major driving force behind the evolution of species-specific traits and novel structures is alterations in gene regulatory networks (GRNs). Comprehending evolution therefore requires an understanding of the nature of changes in GRN structure and the responsible mechanisms. Here, we review two insect pigmentation GRNs in order to examine common themes in GRN evolution and to reveal some of the challenges associated with investigating changes in GRNs across different evolutionary distances at the molecular level. The pigmentation GRN in Drosophila melanogaster and other drosophilids is a well-defined network for which studies from closely related species illuminate the different ways co-option of regulators can occur. The pigmentation GRN for butterflies of the Heliconius species group is less fully detailed but it is emerging as a useful model for exploring important questions about redundancy and modularity in cis-regulatory systems. Both GRNs serve to highlight the ways in which redeployment of trans-acting factors can lead to GRN rewiring and network co-option. To gain insight into GRN evolution, we discuss the importance of defining GRN architecture at multiple levels both within and between species and of utilizing a range of complementary approaches.
PMID:35159319 | DOI:10.3390/cells11030510
Identification of cis-regulatory modules for adeno-associated virus-based cell type-specific targeting in the retina and brain
J Biol Chem. 2022 Feb 8:101674. doi: 10.1016/j.jbc.2022.101674. Online ahead of print.
ABSTRACT
Adeno Associated Viruses (AAVs) targeting specific cell types are powerful tools for studying distinct cell types in the central nervous system (CNS). Cis-regulatory modules (CRMs), e.g., enhancers, are highly cell type-specific and can be integrated into AAVs to render cell type specificity. Chromatin accessibility has been commonly used to nominate CRMs, which have then been incorporated into AAVs and tested for cell type-specificity in the CNS. However, chromatin accessibility data alone cannot accurately annotate active CRMs, as many chromatin-accessible CRMs are not active and fail to drive gene expression in vivo. Using available large-scale datasets on chromatin accessibility, such as those published by the ENCODE project, here we explored strategies to increase efficiency in identifying active CRMs for AAV-based cell type-specific labeling and manipulation. We found that pre-screening of chromatin-accessible putative CRMs based on the density of cell type-specific transcription factor binding sites (TFBSs) can significantly increase efficiency in identifying active CRMs. In addition, generation of synthetic CRMs by stitching chromatin-accessible regions flanking cell type-specific genes can render cell type-specificity in many cases. Using these straightforward strategies, we generated AAVs that can target the extensively studied interneuron and glial cell types in the retina and brain. Both strategies utilize available genomic datasets and can be employed to generate AAVs targeting specific cell types in CNS without conducting comprehensive screening and sequencing experiments, making a step forward in cell type-specific research.
PMID:35148987 | DOI:10.1016/j.jbc.2022.101674
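The pre-screening strategy described in the abstract above, ranking chromatin-accessible candidate regions by the density of cell type-specific TFBSs, can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the motif strings, region names, sequences, and function names are all hypothetical, and exact string matching on the forward strand stands in for the position weight matrix scanning a real analysis would likely use.

```python
from typing import Dict, List, Tuple


def count_motif_hits(sequence: str, motif: str) -> int:
    """Count exact occurrences of motif in sequence, allowing overlaps."""
    hits = 0
    start = sequence.find(motif)
    while start != -1:
        hits += 1
        start = sequence.find(motif, start + 1)
    return hits


def tfbs_density(sequence: str, motifs: List[str]) -> float:
    """Return motif hits per kilobase, summed over all motifs."""
    if not sequence:
        return 0.0
    seq = sequence.upper()
    total = sum(count_motif_hits(seq, m.upper()) for m in motifs)
    return 1000.0 * total / len(seq)


def rank_candidates(
    candidates: Dict[str, str], motifs: List[str]
) -> List[Tuple[str, float]]:
    """Sort candidate regions from highest to lowest TFBS density."""
    scored = [(name, tfbs_density(seq, motifs)) for name, seq in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical motifs and candidate regions, for illustration only.
    motifs = ["CATTGT", "TAATCC"]
    candidates = {
        "region_A": "GGCATTGTCCTAATCCGGCATTGTAA",
        "region_B": "GGGGCCCCGGGGCCCCGGGGCCCCGG",
    }
    for name, density in rank_candidates(candidates, motifs):
        print(f"{name}\t{density:.1f} hits/kb")
```

In this sketch the top-ranked regions would be the ones carried forward for testing in AAV constructs; the cutoff is left to the user.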
Hox proteins interact to pattern neuronal subtypes in C. elegans males
Genetics. 2022 Feb 7:iyac010. doi: 10.1093/genetics/iyac010. Online ahead of print.
ABSTRACT
Hox transcription factors are conserved regulators of neuronal subtype specification on the anteroposterior axis in animals, with disruption of Hox gene expression leading to homeotic transformations of neuronal identities. We have taken advantage of an unusual mutation in the C. elegans Hox gene lin-39, lin-39(ccc16), which transforms neuronal fates in the C. elegans male ventral nerve cord in a manner that depends on a second Hox gene, mab-5. We have performed a genetic analysis centered around this homeotic allele of lin-39 in conjunction with reporters for neuronal target genes and protein interaction assays to explore how LIN-39 and MAB-5 exert both flexibility and specificity in target regulation. We identify cis-regulatory modules in neuronal reporters that are both region-specific and Hox-responsive. Using these reporters of neuronal subtype, we also find that the lin-39(ccc16) mutation disrupts neuronal fates specifically in the region where lin-39 and mab-5 are coexpressed, and that the protein encoded by lin-39(ccc16) is active only in the absence of mab-5. Moreover, the fates of neurons typical to the region of lin-39-mab-5 coexpression depend on both Hox genes. Our genetic analysis, along with evidence from Bimolecular Fluorescence Complementation (BiFC) protein interaction assays, supports a model in which LIN-39 and MAB-5 act at an array of cis regulatory modules to cooperatively activate and to individually activate or repress neuronal gene expression, resulting in regionally specific neuronal fates.
PMID:35137058 | DOI:10.1093/genetics/iyac010
regCNN: identifying Drosophila genome-wide cis-regulatory modules via integrating the local patterns in epigenetic marks and transcription factor binding motifs
Comput Struct Biotechnol J. 2021 Dec 18;20:296-308. doi: 10.1016/j.csbj.2021.12.015. eCollection 2022.
ABSTRACT
Transcription regulation in metazoa is controlled by the binding events of transcription factors (TFs) or regulatory proteins on specific modular DNA regulatory sequences called cis-regulatory modules (CRMs). Understanding the distributions of CRMs on a genomic scale is essential for constructing the metazoan transcriptional regulatory networks that help diagnose genetic disorders. While traditional reporter-assay CRM identification approaches can provide an in-depth understanding of the functions of some CRMs, these methods are usually cost-inefficient and low-throughput. It is generally believed that by integrating diverse genomic data, reliable CRM predictions can be made. Hence, researchers often first resort to computational algorithms for genome-wide CRM screening before specific experiments. However, existing in silico methods for searching potential CRMs were restricted by low sensitivity, poor prediction accuracy, or high computation time from TFBS composition combinatorial complexity. To overcome these obstacles, we designed a novel CRM identification pipeline called regCNN by considering the base-by-base local patterns in TF binding motifs and epigenetic profiles. On the test set, regCNN shows an accuracy/auROC of 84.5%/92.5% in CRM identification. By further considering local patterns in epigenetic profiles and TF binding motifs, it achieves a 4.7% (92.5%-87.8%) improvement in the auROC value over the average value-based pure multi-layer perceptron model. We also demonstrated that regCNN outperforms all currently available tools by at least 11.3% in auROC values. Finally, regCNN is verified to be robust against its resizing window hyperparameter in dealing with the variable lengths of CRMs. The model of regCNN can be downloaded at http://cobisHSS0.im.nuk.edu.tw/regCNN/.
PMID:35035784 | PMC:PMC8724954 | DOI:10.1016/j.csbj.2021.12.015
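For readers less familiar with the auROC metric reported in the abstract above, it can be read as the probability that a randomly chosen true CRM receives a higher score than a randomly chosen non-CRM. The sketch below computes it by direct pairwise comparison; the function name, labels, and scores are made up for illustration and have nothing to do with regCNN's actual outputs.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy labels (1 = CRM, 0 = non-CRM) and classifier scores.
    labels = [1, 0, 1, 0]
    scores = [0.9, 0.8, 0.3, 0.1]
    print(f"auROC = {auroc(labels, scores):.2f}")
```

The pairwise form is quadratic in the number of examples, so it suits small illustrations rather than genome-scale test sets.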
Cis-regulatory sequences in plants: their importance, discovery, and future challenges
Plant Cell. 2021 Nov 22:koab281. doi: 10.1093/plcell/koab281. Online ahead of print.
ABSTRACT
The identification and characterization of cis-regulatory DNA sequences and how they function to coordinate responses to developmental and environmental cues is of paramount importance to plant biology. Key to these regulatory processes are cis-regulatory modules (CRMs), which include enhancers and silencers. Despite the extraordinary advances in high-quality sequence assemblies and genome annotations, the identification and understanding of CRMs, and how they regulate gene expression, lag significantly behind. This is especially true for their distinguishing characteristics and activity states. Here, we review the current knowledge on CRMs and breakthrough technologies enabling identification, characterization, and validation of CRMs; we compare the genomic distributions of CRMs with respect to their target genes between different plant species, and discuss the role of transposable elements harboring CRMs in the evolution of gene expression. This is an exciting time to study cis-regulomes in plants; however, significant existing challenges need to be overcome to fully understand and appreciate the role of CRMs in plant biology and in crop improvement.
PMID:34918159 | DOI:10.1093/plcell/koab281
Mechanisms Underlying Hox-Mediated Transcriptional Outcomes
Front Cell Dev Biol. 2021 Nov 16;9:787339. doi: 10.3389/fcell.2021.787339. eCollection 2021.
ABSTRACT
Metazoans differentially express multiple Hox transcription factors to specify diverse cell fates along the developing anterior-posterior axis. Two challenges arise when trying to understand how the Hox transcription factors regulate the required target genes for morphogenesis: First, how does each Hox factor differ from one another to accurately activate and repress target genes required for the formation of distinct segment and regional identities? Second, how can a Hox factor that is broadly expressed in many tissues within a segment impact the development of specific organs by regulating target genes in a cell type-specific manner? In this review, we highlight how recent genomic, interactome, and cis-regulatory studies are providing new insights into answering these two questions. Collectively, these studies suggest that Hox factors may differentially modify the chromatin of gene targets as well as utilize numerous interactions with additional co-activators, co-repressors, and sequence-specific transcription factors to achieve accurate segment and cell type-specific transcriptional outcomes.
PMID:34869389 | PMC:PMC8635045 | DOI:10.3389/fcell.2021.787339
Chronic Pancreatitis: The True Pathogenic Culprit within the SPINK1 N34S-Containing Haplotype Is No Longer at Large
Genes (Basel). 2021 Oct 23;12(11):1683. doi: 10.3390/genes12111683.
ABSTRACT
A diverse range of loss-of-function variants in the SPINK1 gene (encoding pancreatic secretory trypsin inhibitor) has been identified in patients with chronic pancreatitis (CP). The haplotype harboring the SPINK1 c.101A>G (p.Asn34Ser or N34S) variant (rs17107315:T>C) is one of the most important heritable risk factors for CP as a consequence of its relatively high prevalence worldwide (population allele frequency ≈ 1%) and its considerable effect size (odds ratio ≈ 11). The causal variant responsible for this haplotype has been intensively investigated over the past two decades. The different hypotheses tested addressed whether the N34S missense variant has a direct impact on enzyme structure and function, whether c.101A>G could affect pre-mRNA splicing or mRNA stability, and whether another variant in linkage disequilibrium with c.101A>G might be responsible for the observed association with CP. Having reviewed the currently available genetic and experimental data, we conclude that c.-4141G>T (rs142703147:C>A), which disrupts a PTF1L-binding site within an evolutionarily conserved HNF1A-PTF1L cis-regulatory module located ∼4 kb upstream of the SPINK1 promoter, can be designated as the causal variant beyond reasonable doubt. This case illustrates the difficulties inherent in determining the identity of the causal variant underlying an initially identified disease association.
PMID:34828289 | DOI:10.3390/genes12111683
ReMap 2022: a database of Human, Mouse, Drosophila and Arabidopsis regulatory regions from an integrative analysis of DNA-binding sequencing experiments
Nucleic Acids Res. 2021 Nov 9:gkab996. doi: 10.1093/nar/gkab996. Online ahead of print.
ABSTRACT
ReMap (https://remap.univ-amu.fr) aims to provide manually curated, high-quality catalogs of regulatory regions resulting from a large-scale integrative analysis of DNA-binding experiments in Human, Mouse, Fly and Arabidopsis thaliana for hundreds of transcription factors and regulators. In this 2022 update, we have uniformly processed >11 000 DNA-binding sequencing datasets from public sources across four species. The updated Human regulatory atlas includes 8103 datasets covering a total of 1210 transcriptional regulators (TRs) with a catalog of 182 million (M) peaks, while the updated Arabidopsis atlas reaches 4.8M peaks, 423 TRs across 694 datasets. Also, this ReMap release is enriched by two new regulatory catalogs for Mus musculus and Drosophila melanogaster. First, the Mouse regulatory catalog consists of 123M peaks across 648 TRs as a result of the integration and validation of 5503 ChIP-seq datasets. Second, the Drosophila melanogaster catalog contains 16.6M peaks across 550 TRs from the integration of 1205 datasets. The four regulatory catalogs are browsable through track hubs at UCSC, Ensembl and NCBI genome browsers. Finally, ReMap 2022 comes with a new Cis Regulatory Module identification method, improved quality controls, faster search results, and better user experience with an interactive tour and video tutorials on browsing and filtering ReMap catalogs.
PMID:34751401 | DOI:10.1093/nar/gkab996
The Power of Word-Frequency Based Alignment-Free Functions: a Comprehensive Large-Scale Experimental Analysis
Bioinformatics. 2021 Oct 30:btab747. doi: 10.1093/bioinformatics/btab747. Online ahead of print.
ABSTRACT
MOTIVATION: Alignment-free (AF) distance/similarity functions are a key tool for sequence analysis. Experimental studies on real datasets abound and, to some extent, there are also studies regarding their control of false positive rate (Type I error). However, assessment of their power, i.e., their ability to identify true similarity, has been limited to some members of the D2 family. The corresponding experimental studies have concentrated on short sequences, a scenario no longer adequate for current applications, where sequence lengths may vary considerably. Such a state of the art is methodologically problematic, since information regarding a key feature such as power is either missing or limited.
RESULTS: By concentrating on a representative set of word-frequency based AF functions, we perform the first coherent and uniform evaluation of the power, involving also Type I error for completeness. Two alternative models of important genomic features (cis-regulatory modules and horizontal gene transfer), a wide range of sequence lengths from a few thousand to millions, and different values of k have been used. As a result, we provide a characterization of those AF functions that is novel and informative. Indeed, we identify weak and strong points of each function considered, which may be used as a guide to choose one for analysis tasks. Remarkably, of the fifteen functions that we have considered, only four stand out, with small differences between small and short sequence length scenarios. Finally, in order to encourage the use of our methodology for validation of future AF functions, the Big Data platform supporting it is public.
AVAILABILITY: The software is available at: https://github.com/pipp8/power_statistics.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
PMID:34718420 | DOI:10.1093/bioinformatics/btab747
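As background to the abstract above, the simplest member of the D2 family is the inner product of the k-mer (word) count vectors of two sequences. The following is a minimal sketch of that basic statistic only; the function names and example sequences are illustrative and are not taken from the linked repository, and the normalized variants of D2 are not shown.

```python
from collections import Counter


def kmer_counts(sequence: str, k: int) -> Counter:
    """Count every overlapping word of length k in the sequence."""
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_x: str, seq_y: str, k: int) -> int:
    """Basic D2 statistic: sum over words w of count_x(w) * count_y(w)."""
    counts_x = kmer_counts(seq_x, k)
    counts_y = kmer_counts(seq_y, k)
    # A Counter returns 0 for words absent from seq_y.
    return sum(n * counts_y[word] for word, n in counts_x.items())


if __name__ == "__main__":
    # Illustrative sequences; larger values indicate more shared words.
    print(d2("ACGTACGT", "ACGTTTGT", k=2))
```

Because the raw statistic grows with sequence length, comparing sequences of very different lengths generally calls for one of the normalized variants.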
Identification of limb-specific Lmx1b auto-regulatory modules with Nail-patella syndrome pathogenicity
Nat Commun. 2021 Sep 20;12(1):5533. doi: 10.1038/s41467-021-25844-5.
ABSTRACT
LMX1B haploinsufficiency causes Nail-patella syndrome (NPS; MIM 161200), characterized by nail dysplasia, absent/hypoplastic patellae, chronic kidney disease, and glaucoma. Accordingly in mice, Lmx1b has been shown to play crucial roles in the development of the limb, kidney and eye. Although one functional allele of Lmx1b appears adequate for development, Lmx1b null mice display ventral-ventral distal limbs with abnormal kidney, eye and cerebellar development, more disruptive, but fully concordant with NPS. In Lmx1b functional knockouts (KOs), Lmx1b transcription in the limb is decreased nearly 6-fold, indicating autoregulation. Herein, we report on two conserved Lmx1b-associated cis-regulatory modules (LARM1 and LARM2) that are bound by Lmx1b, amplify Lmx1b expression with unique spatial modularity in the limb, and are necessary for Lmx1b-mediated limb dorsalization. These enhancers, being conserved across vertebrates (including coelacanth, but not other fish species), and required for normal locomotion, provide a unique opportunity to study the role of dorsalization in the fin to limb transition. We also report on two NPS patient families with normal LMX1B coding sequence, but with loss-of-function variations in the LARM1/2 region, stressing the role of regulatory modules in disease pathogenesis.
PMID:34545091 | DOI:10.1038/s41467-021-25844-5
TaNAC100 acts as an integrator of seed protein and starch synthesis conferring pleiotropic effects on agronomic traits in wheat
Plant J. 2021 Sep 7. doi: 10.1111/tpj.15485. Online ahead of print.
ABSTRACT
High-molecular-weight glutenin subunits (HMW-GS) are major components of seed storage proteins (SSPs) and largely determine the processing properties of wheat flour. HMW-GS are encoded by the GLU-1 loci and regulated at the transcriptional level by interaction between cis-elements and transcription factors (TFs). We recently validated the function of conserved cis-regulatory modules (CCRMs) in GLU-1 promoters, but their interacting TFs remained uncharacterized. Here we identified a CCRM-binding NAC protein, TaNAC100, through yeast one hybrid (Y1H) library screening. Transactivation assays demonstrated that TaNAC100 could bind to the GLU-1 promoters and repress their transcription activity in tobacco. Overexpression of TaNAC100 in wheat significantly reduced the contents of HMW-GS and other SSPs as well as total seed proteins. This was confirmed by transcriptome analyses. Conversely, enhanced expression of TaNAC100 increased seed starch contents and expression of key starch synthesis-related genes, such as TaGBSS1 and TaSUS2. Y1H assays also indicated TaNAC100 binding with the promoters of TaGBSS1 and TaSUS2. These results suggest that TaNAC100 functions as a hub controlling seed protein and starch synthesis. Phenotypic analyses showed that TaNAC100 overexpression repressed plant height, increased heading date, and promoted seed size and thousand kernel weight. We also investigated sequence variations in a panel of cultivars, but did not identify significant association of TaNAC100 haplotypes with agronomic traits. The findings not only uncover a useful gene for wheat breeding but also provide an entry point to reveal the mechanism underlying metabolic balance of seed storage products.
PMID:34492155 | DOI:10.1111/tpj.15485
Intergenic spaces: A new frontier to improving plant health
New Phytol. 2021 Sep 3. doi: 10.1111/nph.17706. Online ahead of print.
ABSTRACT
To more sustainably mitigate the impact of crop diseases on plant health and productivity, there is a need for broader spectrum, long-lasting resistance traits. Defense Response (DR) genes, located throughout the genome, participate in cellular and system-wide defense mechanisms to stave off infection by diverse pathogens. This multigenic resistance avoids rapid evolution of a pathogen to overcome host resistance. DR genes reside within resistance-associated Quantitative Trait Loci (QTL), and alleles of DR genes in resistant varieties are more active during pathogen attack relative to susceptible haplotypes. Differential expression of DR genes results from polymorphisms in their regulatory regions, which include cis-regulatory elements such as transcription factor binding sites as well as features that influence epigenetic structural changes to modulate chromatin accessibility during infection. Many of these elements are found in clusters, known as cis-Regulatory Modules (CRMs), which are distributed throughout the host genome. Regulatory regions involved in plant-pathogen interactions may also contain pathogen effector binding elements that regulate DR gene expression, and that, when mutated, result in a change in the plants' response. We posit that CRMs and the multiple regulatory elements that comprise them are potential targets for marker-assisted breeding for broad-spectrum, durable disease resistance.
PMID:34478160 | DOI:10.1111/nph.17706
Linking the FTO obesity rs1421085 variant circuitry to cellular, metabolic, and organismal phenotypes in vivo
Sci Adv. 2021 Jul 21;7(30):eabg0108. doi: 10.1126/sciadv.abg0108. Print 2021 Jul.
ABSTRACT
Variants in FTO have the strongest association with obesity; however, it is still unclear how those noncoding variants mechanistically affect whole-body physiology. We engineered a deletion of the rs1421085 conserved cis-regulatory module (CRM) in mice and confirmed in vivo that the CRM modulates Irx3 and Irx5 gene expression and mitochondrial function in adipocytes. The CRM affects molecular and cellular phenotypes in an adipose depot-dependent manner and affects organismal phenotypes that are relevant for obesity, including decreased high-fat diet-induced weight gain, decreased whole-body fat mass, and decreased skin fat thickness. Last, we connected the CRM to a genetically determined effect on steroid patterns in males that was dependent on nutritional challenge and conserved across mice and humans. Together, our data establish cross-species conservation of the rs1421085 regulatory circuitry at the molecular, cellular, metabolic, and organismal level, revealing previously unknown contextual dependence of the variant's action.
PMID:34290091 | DOI:10.1126/sciadv.abg0108
Annotating the Insect Regulatory Genome
Insects. 2021 Jun 29;12(7):591. doi: 10.3390/insects12070591.
ABSTRACT
An ever-growing number of insect genomes is being sequenced across the evolutionary spectrum. Comprehensive annotation of not only genes but also regulatory regions is critical for reaping the full benefits of this sequencing. Driven by developments in sequencing technologies and in both empirical and computational discovery strategies, the past few decades have witnessed dramatic progress in our ability to identify cis-regulatory modules (CRMs), sequences such as enhancers that play a major role in regulating transcription. Nevertheless, providing a timely and comprehensive regulatory annotation of newly sequenced insect genomes is an ongoing challenge. We review here the methods being used to identify CRMs in both model and non-model insect species, and focus on two tools that we have developed, REDfly and SCRMshaw. These resources can be paired together in a powerful combination to facilitate insect regulatory annotation over a broad range of species, with an accuracy equal to or better than that of other state-of-the-art methods.
PMID:34209769 | DOI:10.3390/insects12070591
Allelic diversification after transposable element exaptation promoted Gsdf as the master sex determining gene of sablefish
Genome Res. 2021 Jun 28:gr.274266.120. doi: 10.1101/gr.274266.120. Online ahead of print.
ABSTRACT
Concepts of evolutionary biology suggest that morphological change may occur by rare punctual, but rather large changes, or by more steady and gradual transformations. It can therefore be asked whether genetic changes underlying morphological, physiological, and/or behavioral innovations during evolution occur in a punctual manner, whereby a single mutational event has prominent phenotypic consequences, or if many consecutive alterations in the DNA over longer time periods lead to phenotypic divergence. In the marine teleost, sablefish (Anoplopoma fimbria), complementary genomic and genetic studies led to the identification of a sex locus on the Y Chromosome. Further characterization of this locus resulted in identification of the transforming growth factor (tgfbr1a) gene, gonadal soma-derived factor (gsdf), as the main candidate for fulfilling the master sex determining (MSD) function. The presence of different X and Y Chromosome copies of this gene indicated that the male heterogametic (XY) system of sex determination in sablefish arose by allelic diversification. The gsdfY gene has a spatio-temporal expression profile characteristic of a male MSD gene. We provide experimental evidence demonstrating a pivotal role of a transposable element (TE) for the divergent function of gsdfY. By insertion within the gsdfY promoter region, this TE generated allelic diversification by bringing cis-regulatory modules that led to transcriptional rewiring and thus creation of a new MSD gene. This points out for the first time in the scenario of MSD gene evolution by allelic diversification, a single, punctual molecular event in the appearance of a new trigger for male development.
PMID:34183453 | DOI:10.1101/gr.274266.120
UniBind: maps of high-confidence direct TF-DNA interactions across nine species
BMC Genomics. 2021 Jun 26;22(1):482. doi: 10.1186/s12864-021-07760-6.
ABSTRACT
BACKGROUND: Transcription factors (TFs) bind specifically to TF binding sites (TFBSs) at cis-regulatory regions to control transcription. It is critical to locate these TF-DNA interactions to understand transcriptional regulation. Efforts to predict bona fide TFBSs benefit from the availability of experimental data mapping DNA binding regions of TFs (chromatin immunoprecipitation followed by sequencing - ChIP-seq).
RESULTS: In this study, we processed ~10,000 public ChIP-seq datasets from nine species to provide high-quality TFBS predictions. After quality control, this culminated in the prediction of ~56 million TFBSs with experimental and computational support for direct TF-DNA interactions for 644 TFs in >1000 cell lines and tissues. These TFBSs were used to predict >197,000 cis-regulatory modules representing clusters of binding events in the corresponding genomes. The high quality of the TFBSs was reinforced by their evolutionary conservation, enrichment at active cis-regulatory regions, and capacity to predict combinatorial binding of TFs. Further, we confirmed that the cell type and tissue specificity of enhancer activity was correlated with the number of TFs with binding sites predicted in these regions. All the data is provided to the community through the UniBind database that can be accessed through its web-interface (https://unibind.uio.no/), a dedicated RESTful API, and as genomic tracks. Finally, we provide an enrichment tool, available as a web-service and an R package, for users to find TFs with enriched TFBSs in a set of provided genomic regions.
CONCLUSIONS: UniBind is the first resource of its kind, providing the largest collection of high-confidence direct TF-DNA interactions in nine species.
PMID:34174819 | DOI:10.1186/s12864-021-07760-6
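The abstract above describes cis-regulatory modules as clusters of binding events. One simple way to picture such clustering is to merge binding sites that lie close together on the same chromosome and keep groups bound by several distinct TFs. The sketch below assumes this distance-based rule purely for illustration; it is not UniBind's actual method, and the `max_gap` and `min_tfs` parameters, the type names, and the coordinates are hypothetical.

```python
from typing import FrozenSet, Iterable, List, NamedTuple


class Site(NamedTuple):
    chrom: str
    start: int
    end: int
    tf: str


class Cluster(NamedTuple):
    chrom: str
    start: int
    end: int
    tfs: FrozenSet[str]


def cluster_sites(
    sites: Iterable[Site], max_gap: int = 100, min_tfs: int = 2
) -> List[Cluster]:
    """Group nearby sites and keep groups with at least min_tfs distinct TFs."""
    groups: List[List[Site]] = []
    group_end = 0
    for site in sorted(sites, key=lambda s: (s.chrom, s.start, s.end)):
        same_chrom = bool(groups) and site.chrom == groups[-1][0].chrom
        if same_chrom and site.start - group_end <= max_gap:
            # Close enough to the current group: extend it.
            groups[-1].append(site)
            group_end = max(group_end, site.end)
        else:
            # New chromosome or gap too large: start a new group.
            groups.append([site])
            group_end = site.end
    clusters = []
    for group in groups:
        tfs = frozenset(s.tf for s in group)
        if len(tfs) >= min_tfs:
            end = max(s.end for s in group)
            clusters.append(Cluster(group[0].chrom, group[0].start, end, tfs))
    return clusters


if __name__ == "__main__":
    # Hypothetical binding sites for two TFs, "A" and "B".
    example = [
        Site("chr1", 100, 110, "A"),
        Site("chr1", 150, 160, "B"),
        Site("chr1", 1000, 1010, "A"),
        Site("chr2", 105, 115, "B"),
    ]
    for cluster in cluster_sites(example):
        print(cluster.chrom, cluster.start, cluster.end, sorted(cluster.tfs))
```

Requiring more than one distinct TF per cluster reflects the combinatorial binding mentioned in the abstract, but the threshold itself is an arbitrary choice here.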
Analysis of gene network bifurcation during optic cup morphogenesis in zebrafish
Nat Commun. 2021 Jun 23;12(1):3866. doi: 10.1038/s41467-021-24169-7.
ABSTRACT
Sight depends on the tight cooperation between photoreceptors and pigmented cells, which derive from common progenitors through the bifurcation of a single gene regulatory network into the neural retina (NR) and retinal-pigmented epithelium (RPE) programs. Although genetic studies have identified upstream nodes controlling these networks, their regulatory logic remains poorly investigated. Here, we characterize transcriptome dynamics and chromatin accessibility in segregating NR/RPE populations in zebrafish. We analyze cis-regulatory modules and enriched transcription factor motifs to show extensive network redundancy and context-dependent activity. We identify downstream targets, highlighting an early recruitment of desmosomal genes in the flattening RPE and revealing Tead factors as upstream regulators. We investigate the RPE specification network dynamics to uncover an unexpected sequence of transcription factor recruitment, which is conserved in humans. This systematic interrogation of the NR/RPE bifurcation should improve both genetic counseling for eye disorders and hiPSCs-to-RPE differentiation protocols for cell-replacement therapies in degenerative diseases.
PMID:34162866 | DOI:10.1038/s41467-021-24169-7