Drug-induced Adverse Events

ModelView for ModelDB: Online Presentation of Model Structure.
Neuroinformatics. 2015 Oct;13(4):459-70
Authors: McDougal RA, Morse TM, Hines ML, Shepherd GM
Abstract
ModelDB (modeldb.yale.edu), a searchable repository of source code of more than 950 published computational neuroscience models, seeks to promote model reuse and reproducibility. Code sharing is a first step; however, model source code is often large and not easily understood. To aid users, we have developed ModelView, a web application for ModelDB that presents a graphical view of model structure augmented with contextual information for NEURON and NEURON-runnable (e.g. NeuroML, PyNN) models. Web presentation provides a rich, simulator-independent environment for interacting with graphs. The necessary data is generated by combining manual curation, text-mining the source code, querying ModelDB, and simulator introspection. Key features of the user interface along with the data analysis, storage, and visualization algorithms are explained. With this tool, researchers can examine and assess the structure of hundreds of models in ModelDB in a standardized presentation without installing any software, downloading the model, or reading model source code.
PMID: 25896640 [PubMed - indexed for MEDLINE]
Digitisation, Big Data, and the Future of the Medical Humanities: Text-Mining and the History of Medicine: Big Data, Big Questions?
Med Hist. 2016 Apr;60(2):294-6
Authors: Toon E, Timmermann C, Worboys M
PMID: 26971613 [PubMed - indexed for MEDLINE]
Literature-Informed Analysis of a Genome-Wide Association Study of Gestational Age in Norwegian Women and Children Suggests Involvement of Inflammatory Pathways.
PLoS One. 2016;11(8):e0160335
Authors: Bacelis J, Juodakis J, Sengpiel V, Zhang G, Myhre R, Muglia LJ, Nilsson S, Jacobsson B
Abstract
BACKGROUND: Five to eighteen percent of pregnancies worldwide end in preterm birth, which is the major cause of neonatal death and morbidity. Approximately 30% of the variation in gestational age at birth can be attributed to genetic factors. Genome-wide association studies (GWAS) have not yet shown robust evidence of association with genomic loci.
METHODS: We separately investigated 1921 Norwegian mothers and 1199 children from pregnancies with spontaneous onset of delivery. Individuals were further divided based on the onset of delivery: initiated by labor or prelabor rupture of membranes. Genetic association with ultrasound-dated gestational age was evaluated using three genetic models and adaptive permutations. The top-ranked loci were tested for enrichment in 12 candidate gene-sets generated by text-mining PubMed abstracts containing pregnancy-related keywords.
RESULTS: The six GWAS did not reveal significant associations, with the most extreme empirical p = 5.1 × 10⁻⁷. The top loci from maternal GWAS with deliveries initiated by labor showed significant enrichment in 10 PubMed gene-sets, e.g., p = 0.001 and 0.005 for keywords "uterus" and "preterm" respectively. Enrichment signals were mainly caused by infection/inflammation-related genes TLR4, NFKB1, ABCA1, MMP9. Literature-informed analysis of top loci revealed further immunity genes: IL1A, IL1B, CAMP, TREM1, TFRC, NFKBIA, MEFV, IRF8, WNT5A.
CONCLUSION: Our analyses support the role of inflammatory pathways in determining pregnancy duration and provide a list of 32 candidate genes for follow-up work. We observed that the top regions from GWAS in mothers with labor-initiated deliveries overlap with pregnancy-related genes significantly more often than would be expected by chance, suggesting that increased sample size would benefit similar studies.
PMID: 27490719 [PubMed - as supplied by publisher]
Erratum to: A novel procedure on next generation sequencing data analysis using text mining algorithm.
BMC Bioinformatics. 2016;17(1):301
Authors: Zhao W, Chen JJ, Perkins R, Wang Y, Liu Z, Hong H, Tong W, Zou W
PMID: 27489012 [PubMed - as supplied by publisher]
Systems Pharmacology Dissection of the Protective Effect of Myricetin Against Acute Ischemia/Reperfusion-Induced Myocardial Injury in Isolated Rat Heart.
Cardiovasc Toxicol. 2016 Aug 2;
Authors: Qiu Y, Cong N, Liang M, Wang Y, Wang J
Abstract
In this paper, we investigated the multi-target effect of myricetin as a therapeutic for cardiovascular disease, using an acute ischemia/reperfusion-induced myocardial injury model to gain insight into its mechanism of action. The compound-target interaction profiles of myricetin were determined using a combination of text mining, chemometric and chemogenomic methods. The effect of myricetin on cardiac function was investigated by carrying out experiments in rats subjected to ischemia/reperfusion (I/R) using Langendorff retrograde perfusion technology. Compared to the I/R group, pretreatment with 5 μM myricetin was observed to improve the maximum up/down rate of left ventricular pressure (dp/dt max) and coronary flow, raise left ventricular developed pressure, and decrease creatine kinase and lactate dehydrogenase levels in coronary flow. In addition, myricetin treatment was shown to have beneficial effects through its ability to reduce both infarct size and levels of cardiomyocyte apoptosis. Myricetin was also observed to have antioxidant properties, as evidenced by its ability to reduce MDA levels, while increasing both SOD levels and the GSH/GSSG ratio. Finally, an upregulation of 6-phosphogluconate dehydrogenase and fatty acid synthase expression and a downregulation of cyclooxygenase-2, cytochrome P450 and p38 mitogen-activated protein kinase expression suggest that myricetin acts through mechanisms which alter relevant signaling pathways. In summary, our results demonstrate that myricetin has protective cardiovascular effects against I/R-induced myocardial injury.
PMID: 27484498 [PubMed - as supplied by publisher]
iLIR database: a web resource for LIR motif-containing proteins in eukaryotes.
Autophagy. 2016 Aug 2;:0
Authors: Jacomin AC, Samavedam S, Promponas V, Nezis IP
Abstract
Atg8-family proteins are the best-studied proteins of the core autophagic machinery. They are essential for the elongation and closure of the phagophore into a proper autophagosome. Moreover, Atg8-family proteins are associated with the phagophore from the initiation of the autophagic process to, or just prior to, the fusion of autophagosomes with lysosomes. In addition to their implication in autophagosome biogenesis, they are crucial for selective autophagy through their ability to interact with selective autophagy receptor proteins necessary for the specific targeting of substrates for autophagic degradation. In the last few years it has been revealed that Atg8-interacting proteins include not only receptors but also components of the core autophagic machinery, proteins associated with vesicles and their transport, and specific proteins that are selectively degraded by autophagy. Atg8-interacting proteins contain a short linear LC3-interacting region/LC3 recognition sequence/Atg8-interacting motif (LIR/LRS/AIM) that is responsible for their interaction with Atg8-family proteins. These proteins are referred to as LIR-containing proteins (LIRCPs). So far, many experimental efforts have been carried out to identify new LIRCPs, leading to the characterization of some of them in the past 10 years. Given the need for the identification of LIRCPs in various organisms, we developed the iLIR database (https://ilir.warwick.ac.uk) as a freely available web resource, listing all the putative canonical LIRCPs identified in silico in the proteomes of 8 model organisms using the iLIR server, combined with a Gene Ontology (GO) term analysis. Additionally, a curated text-mining analysis of the literature permitted us to identify novel putative LIRCPs in mammals that have not previously been associated with autophagy.
PMID: 27484196 [PubMed - as supplied by publisher]
METSP: a maximum-entropy classifier based text mining tool for transporter-substrate identification with semistructured text.
Biomed Res Int. 2015;2015:254838
Authors: Zhao M, Chen Y, Qu D, Qu H
Abstract
The substrates of a transporter are not only useful for inferring the function of the transporter, but also important for discovering compound-compound interactions and reconstructing metabolic pathways. Though plenty of data has been accumulated with the development of new technologies such as in vitro transporter assays, the search for substrates of transporters is far from complete. In this article, we introduce METSP, a maximum-entropy classifier designed to retrieve transporter-substrate pairs (TSPs) from semistructured text. Based on the high-quality annotation from UniProt, METSP achieves high precision and recall in cross-validation experiments. When METSP is applied to 182,829 human transporter annotation sentences in UniProt, it identifies 3942 sentences with transporter and compound information. Finally, 1547 high-confidence human TSPs are identified for further manual curation, of which 58.37% involve novel substrates not annotated in public transporter databases. METSP is the first efficient tool to extract TSPs from semistructured annotation text in UniProt. This tool can help to determine the precise substrates and drugs of transporters, thus facilitating drug-target prediction, metabolic network reconstruction, and literature classification.
PMID: 26495291 [PubMed - indexed for MEDLINE]
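To make the classifier type named in the METSP abstract concrete: for two classes, a maximum-entropy classifier is equivalent to logistic regression. The sketch below trains one with plain stochastic gradient ascent on bag-of-words features. The feature set, the toy annotation sentences, the function names, and the hyperparameters are all assumptions made for illustration; the abstract does not describe METSP's actual features or training procedure.

```python
import math
from collections import defaultdict


def featurize(sentence):
    """Bag-of-words features (an assumption; METSP's real features are not given)."""
    return set(sentence.lower().split())


def sigmoid(score):
    return 1.0 / (1.0 + math.exp(-score))


def train_maxent(examples, epochs=200, lr=0.5):
    """Fit a binary maximum-entropy (logistic regression) model.

    examples: list of (sentence, label) with label 1 = mentions a
    transporter-substrate pair, 0 = does not.
    """
    weights = defaultdict(float)
    bias = 0.0
    for _ in range(epochs):
        for sentence, label in examples:
            feats = featurize(sentence)
            p = sigmoid(bias + sum(weights[f] for f in feats))
            err = label - p  # gradient of the log-likelihood for active features
            bias += lr * err
            for f in feats:
                weights[f] += lr * err
    return weights, bias


def predict_proba(model, sentence):
    """Probability that the sentence describes a transporter-substrate pair."""
    weights, bias = model
    return sigmoid(bias + sum(weights.get(f, 0.0) for f in featurize(sentence)))


# Hypothetical annotation-style sentences, invented for this example.
training = [
    ("Transports glucose across the membrane", 1),
    ("Mediates uptake of citrate", 1),
    ("Belongs to the major facilitator superfamily", 0),
    ("Expressed in liver and kidney", 0),
]
model = train_maxent(training)
```

Calling `predict_proba(model, "Mediates uptake of glucose")` returns a probability above 0.5 on this toy data, while `predict_proba(model, "Expressed in liver")` falls below it. A real system would add regularization and richer features.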
Diagnostic utility of droplet digital PCR for HIV reservoir quantification.
J Virus Erad. 2016;2(3):162-9
Authors: Trypsteen W, Kiselinova M, Vandekerckhove L, De Spiegelaere W
Abstract
Quantitative real-time PCR (qPCR) is implemented in many molecular laboratories worldwide for the quantification of viral nucleic acids. However, over the last two decades, there has been renewed interest in the concept of digital PCR (dPCR) as this platform offers direct quantification without the need for standard curves, a simplified workflow and the possibility to extend the current detection limit. These benefits are of great interest in terms of the quantification of low viral levels in HIV reservoir research because changes in the dynamics of residual HIV reservoirs will be important to monitor HIV cure efforts. Here, we have implemented a systematic literature screening and text mining approach to map the use of droplet dPCR (ddPCR) in the context of HIV quantification. In addition, several technical aspects of ddPCR were compared with qPCR: accuracy, sensitivity, precision and reproducibility, to determine its diagnostic utility. We have observed that ddPCR was used in different body compartments in multiple HIV-1 and HIV-2 assays, with the majority of reported assays focusing on HIV-1 DNA-based applications (i.e. total HIV DNA). Furthermore, ddPCR showed a higher accuracy, precision and reproducibility, but similar sensitivity when compared to qPCR due to reported false positive droplets in the negative template controls with a need for standardised data analysis (i.e. threshold determination). In the context of a low level of detection and HIV reservoir diagnostics, ddPCR can offer a valid alternative to qPCR-based assays but before this platform can be clinically accredited, some remaining issues need to be resolved.
PMID: 27482456 [PubMed]
Gender differences in cancer susceptibility: role of oxidative stress.
Carcinogenesis. 2016 Jul 31;
Authors: Ali I, Högberg J, Hsieh JH, Auerbach S, Korhonen A, Stenius U, Silins I
Abstract
Cancer is a leading cause of death worldwide and environmental factors, including chemicals, have been suggested as major etiological factors. Cancer statistics indicate that men get more cancer than women. However, differences in the known risk factors, including life-style or occupational exposure, offer only a partial explanation. Using a text mining tool, we have investigated the scientific literature concerning male- and female-specific rat carcinogens that induced tumors only in one gender in the NTP 2-year cancer bioassay. Our evaluation shows that oxidative stress, although frequently reported for both male- and female-specific rat carcinogens, was mentioned significantly more in literature concerning male-specific rat carcinogens. Literature analysis of testosterone and estradiol showed the same pattern. Tox21 high-throughput assay results, although showing only weak association of oxidative stress-related processes for male- and female-specific rat carcinogens, provide additional support. We also analyzed the literature concerning 26 established human carcinogens (IARC group 1). Oxidative stress was more frequently reported for the majority of these carcinogens, and the Tox21 data resembled that of male-specific rat carcinogens. Thus, our data, based on about 600,000 scientific abstracts and Tox21 screening assays, suggest a link between male-specific carcinogens, testosterone and oxidative stress. This implies that a different cellular response to oxidative stress in men and women may be a critical factor in explaining the greater cancer susceptibility observed in men. Although the IARC carcinogens are classified as human carcinogens, their classification is largely based on epidemiological evidence from male cohorts, which raises the question whether carcinogen classifications should be gender specific.
PMID: 27481070 [PubMed - as supplied by publisher]
Gene-Disease Interaction Retrieval from Multiple Sources: A Network Based Method.
Biomed Res Int. 2016;2016:3594517
Authors: Huang L, Wang Y, Wang Y, Bai T
Abstract
The number of gene-related databases has grown substantially along with bioinformatics research on genes. Those databases are filled with various gene functions, pathways, interactions, and so forth, while much biomedical knowledge about human diseases is stored as text in all kinds of literature. Researchers have developed many methods to extract structured biomedical knowledge. Some study and improve text mining algorithms for efficiency, in order to cover as many data sources as possible, while others build open source databases that accept individual submissions, in order to achieve accuracy. This paper combines both efforts with biomedical ontologies to build an interaction network of multiple biomedical ontologies, which guarantees its robustness as well as its wide coverage of biomedical publications. On this network, we implement an algorithm which discovers paths between concept pairs and shows potential relations.
PMID: 27478829 [PubMed - in process]
Decision Support Environment for Medical Product Safety Surveillance.
J Biomed Inform. 2016 Jul 28;
Authors: Botsis T, Jankosky C, Arya D, Kreimeyer K, Foster M, Pandey A, Wang W, Zhang G, Forshee R, Goud R, Menschik D, Walderhaug M, Woo EJ, Scott J
Abstract
We have developed a Decision Support Environment (DSE) for medical experts at the US Food and Drug Administration (FDA). The DSE contains two integrated systems: The Event-based Text-mining of Health Electronic Records (ETHER) and the Pattern-based and Advanced Network Analyzer for Clinical Evaluation and Assessment (PANACEA). These systems assist medical experts in reviewing reports submitted to the Vaccine Adverse Event Reporting System (VAERS) and the FDA Adverse Event Reporting System (FAERS). In this manuscript, we describe the DSE architecture and key functionalities, and examine its potential contributions to the signal management process by focusing on four use cases: the identification of missing cases from a case series, the identification of duplicate case reports, retrieving cases for a case series analysis, and community detection for signal identification and characterization.
PMID: 27477839 [PubMed - as supplied by publisher]
IPAT: a freely accessible software tool for analyzing multiple patent documents with inbuilt landscape visualizer.
Pharm Pat Anal. 2015;4(5):377-86
Authors: Ajay D, Gangwal RP, Sangamwar AT
Abstract
Intelligent Patent Analysis Tool (IPAT) is an online data retrieval tool based on a text mining algorithm that extracts specific patent information in a predetermined pattern into an Excel sheet. The software is designed and developed to retrieve and analyze technology information from multiple patent documents and generate various patent landscape graphs and charts. The software is written in C# using Visual Studio 2010; it extracts publicly available patent information from web pages such as Google Patent and simultaneously studies various technology trends based on user-defined parameters. In other words, IPAT combined with manual categorization will act as an excellent technology assessment tool in competitive intelligence and due diligence for forecasting future R&D.
PMID: 26452016 [PubMed - indexed for MEDLINE]
Drug Drug Interaction Extraction from Biomedical Literature Using Syntax Convolutional Neural Network.
Bioinformatics. 2016 Jul 27;
Authors: Zhao Z, Yang Z, Luo L, Lin H, Wang J
Abstract
MOTIVATION: Detecting drug-drug interaction (DDI) has become a vital part of public health safety. Therefore, using text mining techniques to extract DDIs from biomedical literature has received considerable attention. However, this research is still at an early stage and its performance has much room to improve.
RESULTS: In this paper, we present a syntax convolutional neural network (SCNN) based DDI extraction method. In this method, a novel word embedding, syntax word embedding, is proposed to employ the syntactic information of a sentence. Then the position and part of speech (POS) features are introduced to extend the embedding of each word. Later, an auto-encoder is introduced to encode the traditional bag-of-words feature (sparse 0-1 vector) as a dense real-valued vector. Finally, a combination of embedding-based convolutional features and traditional features is fed to the softmax classifier to extract DDIs from biomedical literature. Experimental results on the DDIExtraction 2013 corpus show that SCNN obtains better performance (an F-score of 0.686) than other state-of-the-art methods.
AVAILABILITY: The source code is available for academic use at http://202.118.75.18:8080/DDI/SCNN-DDI.zip
CONTACT: yangzh@dlut.edu.cn
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
PMID: 27466626 [PubMed - as supplied by publisher]
Establishing a baseline for literature mining human genetic variants and their relationships to disease cohorts.
BMC Med Inform Decis Mak. 2016;16 Suppl 1:68
Authors: Verspoor KM, Heo GE, Kang KY, Song M
Abstract
BACKGROUND: The Variome corpus, a small collection of published articles about inherited colorectal cancer, includes annotations of 11 entity types and 13 relation types related to the curation of the relationship between genetic variation and disease. Due to the richness of these annotations, the corpus provides a good testbed for evaluation of biomedical literature information extraction systems.
METHODS: In this paper, we focus on assessing performance on extracting the relations in the corpus, using gold standard entities as a starting point, to establish a baseline for extraction of relations important for extraction of genetic variant information from the literature. We test the application of the Public Knowledge Discovery Engine for Java (PKDE4J) system, a natural language processing system designed for information extraction of entities and relations in text, on the relation extraction task using this corpus.
RESULTS: For the relations which are attested at least 100 times in the Variome corpus, we realise a performance ranging from 0.78-0.84 Precision-weighted F-score, depending on the relation. We find that the PKDE4J system adapted straightforwardly to the range of relation types represented in the corpus; some extensions to the original methodology were required to adapt to the multi-relational classification context. The results are competitive with state-of-the-art relation extraction performance on more heavily studied corpora, although the analysis shows that the Recall of a co-occurrence baseline outweighs the benefit of improved Precision for many relations, indicating the value of simple semantic constraints on relations.
CONCLUSIONS: This work represents the first attempt to apply relation extraction methods to the Variome corpus. The results demonstrate that automated methods have good potential to structure the information expressed in the published literature related to genetic variants, connecting mutations to genes, diseases, and patient cohorts. Further development of such approaches will facilitate more efficient biocuration of genetic variant information into structured databases, leveraging the knowledge embedded in the vast publication literature.
PMID: 27454860 [PubMed - in process]
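The co-occurrence baseline mentioned in the RESULTS of the abstract above can be illustrated with a short sketch: every pair of entities that appears in the same sentence is predicted to be related, which tends to give high Recall at the cost of Precision. The sentence records, entity names, and gold relations below are invented for illustration and are not taken from the Variome corpus. The F-score computed here is the ordinary balanced F1, not the Precision-weighted variant the abstract reports.

```python
from itertools import product


def cooccurrence_baseline(sentences):
    """Predict a relation for every (mutation, disease) pair sharing a sentence."""
    predicted = set()
    for sent in sentences:
        for mutation, disease in product(sent["mutations"], sent["diseases"]):
            predicted.add((sent["id"], mutation, disease))
    return predicted


def precision_recall_f1(predicted, gold):
    """Standard precision, recall, and balanced F1 over sets of relations."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical gold-standard entities per sentence (placeholder names).
sentences = [
    {"id": 1, "mutations": ["mut_1", "mut_2"], "diseases": ["disease_1"]},
    {"id": 2, "mutations": ["mut_3"], "diseases": ["disease_2"]},
]
# Hypothetical gold relations: mut_2 co-occurs with disease_1 but is unrelated.
gold = {(1, "mut_1", "disease_1"), (2, "mut_3", "disease_2")}

predicted = cooccurrence_baseline(sentences)
precision, recall, f1 = precision_recall_f1(predicted, gold)
```

On this toy input the baseline finds both gold relations (Recall 1.0) but also predicts the spurious `mut_2` pair, so Precision drops to 2/3 and F1 is 0.8.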
Protein-protein interaction extraction with feature selection by evaluating contribution levels of groups consisting of related features.
BMC Bioinformatics. 2016;17 Suppl 7:246
Authors: Thuy Phan TT, Ohkawa T
Abstract
BACKGROUND: Protein-protein interaction (PPI) extraction from published scientific articles is one key issue in biological research due to its importance in grasping biological processes. Despite considerable advances of recent research in automatic PPI extraction from articles, demand remains to enhance the performance of the existing methods.
RESULTS: Our feature-based method incorporates the strength of many kinds of diverse features, such as lexical and word context features derived from sentences, syntactic features derived from parse trees, and features using existing patterns to extract PPIs automatically from articles. Among these abundant features, we assemble the related features into four groups and define the contribution level (CL) for each group, which consists of related features. Our method consists of two steps. First, we divide the training set into subsets based on the structure of the sentence and the existence of significant keywords (SKs) and apply the sentence patterns given in advance to each subset. Second, we automatically perform feature selection based on the CL values of the four groups that consist of related features and the k-nearest neighbor algorithm (k-NN) through three approaches: (1) focusing on the group with the best contribution level (BEST1G); (2) unoptimized combination of three groups with the best contribution levels (U3G); (3) optimized combination of two groups with the best contribution levels (O2G).
CONCLUSIONS: Our method outperforms other state-of-the-art PPI extraction systems in terms of F-score on the HPRD50 corpus and achieves promising results that are comparable with these PPI extraction systems on other corpora. Further, on all the corpora our method always obtains a better F-score than k-NN alone, which does not exploit the CLs of the groups of related features.
PMID: 27454611 [PubMed - in process]
CLASH: Complementary Linkage with Anchoring and Scoring for Heterogeneous biomolecular and clinical data.
BMC Med Inform Decis Mak. 2016;16 Suppl 3:72
Authors: Nam Y, Kim M, Lee K, Shin H
Abstract
BACKGROUND: Disease-disease association is increasingly viewed and analyzed as a network, in which the connections between diseases are configured using source information from interactome maps of biomolecules such as genes, proteins, and metabolites. Although abundant source information leads to tighter connections between diseases in the network, for certain groups of diseases, such as metabolic diseases, few connections occur because source information is insufficient; a large proportion of their associated genes are still unknown. One way to circumvent the lack of source information is to integrate available external information using one of the current integration or fusion methods. However, if one wants a disease network that places heavy emphasis on the original source of data and uses external sources only to complement it, integration may not be appropriate. Interpretation of the integrated network would be ambiguous: the meanings conferred on edges would be vague because of the fused information.
METHODS: In this study, we propose a network-based algorithm that complements the original network by utilizing external information while preserving the network's originality. The proposed algorithm links a disconnected node to the disease network by using complementary information from an external data source through four steps: anchoring, connecting, scoring, and stopping.
RESULTS: When applied to the network of metabolic diseases that is sourced from protein-protein interaction data, the proposed algorithm recovered connections by 97%, and improved the AUC performance up to 0.71 (from 0.55) by using external information drawn from text mining results on the PubMed comorbidity literature. Experimental results also show that the proposed algorithm is robust to noisy external information.
CONCLUSION: The novelty of this research is that the proposed algorithm preserves the network's originality while complementing it with external information. Furthermore, it can be utilized for original association recovery and novel association discovery in disease networks.
PMID: 27454118 [PubMed - in process]
Reflection of successful anticancer drug development processes in the literature.
Drug Discov Today. 2016 Jul 18;
Authors: Heinemann F, Huber T, Meisel C, Bundschus M, Leser U
Abstract
The development of cancer drugs is time-consuming and expensive. In particular, failures in late-stage clinical trials are a major cost driver for pharmaceutical companies. This puts a high demand on methods that provide insights into the success chances of new potential medicines. In this study, we systematically analyze publication patterns emerging along the drug discovery process of targeted cancer therapies, starting from basic research to drug approval or failure. We find clear differences in the patterns of approved drugs compared with those that failed in Phase II/III. Feeding these features into a machine learning classifier allows us to predict the approval or failure of a targeted cancer drug significantly better than educated guessing. We believe that these findings could lead to novel measures for supporting decision making in drug development.
PMID: 27443674 [PubMed - as supplied by publisher]
A literature-driven method to calculate similarities among diseases.
Comput Methods Programs Biomed. 2015 Nov;122(2):108-22
Authors: Kim H, Yoon Y, Ahn J, Park S
Abstract
BACKGROUND: "Our lives are connected by a thousand invisible threads and along these sympathetic fibers, our actions run as causes and return to us as results". This is Herman Melville's famous quote describing connections among human lives. To paraphrase Melville's quote, diseases are connected by many functional threads and along these sympathetic fibers, diseases run as causes and return as results. The quote explains the reason for researching disease-disease similarity and disease networks. Measuring similarities between diseases and constructing disease networks can play an important role in disease function research and in disease treatment. To estimate disease-disease similarities, we propose a novel literature-based method.
METHODS AND RESULTS: The proposed method extracted disease-gene relations and disease-drug relations from literature and used the frequencies of occurrence of the relations as features to calculate similarities among diseases. We also constructed a disease network with top-ranking disease pairs from our method. The proposed method discovered a larger number of answer disease pairs than other comparable methods and showed the lowest p-value.
CONCLUSIONS: We presume that our method showed good results because it uses literature data, uses all possible gene symbols and drug names as features of a disease, and determines feature values of diseases from the frequencies of co-occurrence of two entities. The disease-disease similarities from the proposed method can be used in computational biology research that relies on similarities among diseases.
PMID: 26212477 [PubMed - indexed for MEDLINE]
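The idea in the abstract above, using relation frequencies as features and comparing diseases by those features, can be sketched as follows. Cosine similarity is an assumption chosen for illustration; the abstract does not state which similarity measure the authors used. The disease names, gene and drug names, and counts are likewise invented.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse frequency vectors (Counters)."""
    shared = a.keys() & b.keys()
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_pairs(features):
    """Return all disease pairs sorted by descending similarity."""
    names = sorted(features)
    pairs = [
        (cosine_similarity(features[x], features[y]), x, y)
        for i, x in enumerate(names)
        for y in names[i + 1:]
    ]
    return sorted(pairs, reverse=True)


# Hypothetical co-occurrence counts: (entity type, entity name) -> frequency.
disease_features = {
    "disease_A": Counter({("gene", "G1"): 4, ("gene", "G2"): 1, ("drug", "D1"): 3}),
    "disease_B": Counter({("gene", "G1"): 2, ("drug", "D1"): 2, ("drug", "D2"): 1}),
    "disease_C": Counter({("gene", "G3"): 5}),
}

ranking = rank_pairs(disease_features)
```

Here `disease_A` and `disease_B` share a gene and a drug, so they rank first, while `disease_C` shares no features with either and scores zero against both. The top-ranking pairs from such a list would be the edges of the disease network.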
Expert-Guided Generative Topographical Modeling with Visual to Parametric Interaction.
PLoS One. 2016;11(2):e0129122
Authors: Han C, House L, Leman SC
Abstract
Introduced by Bishop et al. in 1996, Generative Topographic Mapping (GTM) is a powerful nonlinear latent variable modeling approach for visualizing high-dimensional data. It has proven useful when typical linear methods fail. However, GTM still suffers from drawbacks. Its complex parameterization of data makes GTM hard to fit and sensitive to slight changes in the model. For this reason, we extend GTM to a visual analytics framework so that users may guide the parameterization and assess the data from multiple GTM perspectives. Specifically, we develop the theory and methods for Visual to Parametric Interaction (V2PI) with data using GTM visualizations. The result is a dynamic version of GTM that fosters data exploration. We refer to the new version as V2PI-GTM. In this paper, we develop V2PI-GTM in stages and demonstrate its benefits within the context of a text mining case study.
PMID: 26905728 [PubMed - indexed for MEDLINE]
DESM: portal for microbial knowledge exploration systems.
Nucleic Acids Res. 2016 Jan 4;44(D1):D624-33
Authors: Salhi A, Essack M, Radovanovic A, Marchand B, Bougouffa S, Antunes A, Simoes MF, Lafi FF, Motwalli OA, Bokhari A, Malas T, Amoudi SA, Othum G, Allam I, Mineta K, Gao X, Hoehndorf R, C Archer JA, Gojobori T, Bajic VB
Abstract
Microorganisms produce an enormous variety of chemical compounds. It is of general interest for microbiology and biotechnology researchers to have means to explore information about the molecular and genetic basis of the functioning of different microorganisms and their ability for bioproduction. To enable such exploration, we compiled 45 topic-specific knowledgebases (KBs) accessible through the DESM portal (www.cbrc.kaust.edu.sa/desm). The KBs contain information derived through text-mining of PubMed information and complemented by information data-mined from various other resources (e.g. ChEBI, Entrez Gene, GO, KOBAS, KEGG, UniPathways, BioGrid). All PubMed records were indexed using 4,538,278 concepts from 29 dictionaries, with 1,638,986 records utilized in KBs. Concepts used are normalized whenever possible. Most of the KBs focus on a particular type of microbial activity, such as production of biocatalysts or nutraceuticals. Others are focused on specific categories of microorganisms, e.g. streptomyces or cyanobacteria. KBs are all structured in a uniform manner and have a standardized user interface. Information exploration is enabled through various searches. Users can explore the statistically most significant concepts or pairs of concepts, generate hypotheses, create interactive networks of associated concepts and export results. We believe DESM will be a useful complement to the existing resources to benefit microbiology and biotechnology research.
PMID: 26546514 [PubMed - indexed for MEDLINE]