Drug-induced Adverse Events

Use of cost-effectiveness analysis to compare the efficiency of study identification methods in systematic reviews.
Syst Rev. 2016;5(1):140
Authors: Shemilt I, Khan N, Park S, Thomas J
Abstract
BACKGROUND: Meta-research studies investigating methods, systems, and processes designed to improve the efficiency of systematic review workflows can contribute to building an evidence base that can help to increase value and reduce waste in research. This study demonstrates the use of an economic evaluation framework to compare the costs and effects of four variant approaches to identifying eligible studies for consideration in systematic reviews.
METHODS: A cost-effectiveness analysis was conducted using a basic decision-analytic model, to compare the relative efficiency of 'safety first', 'double screening', 'single screening' and 'single screening with text mining' approaches in the title-abstract screening stage of a 'case study' systematic review about undergraduate medical education in UK general practice settings. Incremental cost-effectiveness ratios (ICERs) were calculated as the 'incremental cost per citation 'saved' from inappropriate exclusion' from the review. Resource use and effect parameters were estimated based on retrospective analysis of 'review process' meta-data curated alongside the 'case study' review, in conjunction with retrospective simulation studies to model the integrated use of text mining. Unit cost parameters were estimated based on the 'case study' review's project budget. A base case analysis was conducted, with deterministic sensitivity analyses to investigate the impact of variations in values of key parameters.
RESULTS: Use of 'single screening with text mining' would have resulted in title-abstract screening workload reductions (base case analysis) of >60 % compared with other approaches. Across modelled scenarios, the 'safety first' approach was, consistently, equally effective and less costly than conventional 'double screening'. Compared with 'single screening with text mining', estimated ICERs for the two non-dominated approaches (base case analyses) ranged from £1975 ('single screening' without a 'provisionally included' code) to £4427 ('safety first' with a 'provisionally included' code) per citation 'saved'. Patterns of results were consistent between base case and sensitivity analyses.
CONCLUSIONS: Alternatives to the conventional 'double screening' approach, integrating text mining, warrant further consideration as potentially more efficient approaches to identifying eligible studies for systematic reviews. Comparable economic evaluations conducted using other systematic review datasets are needed to determine the generalisability of these findings and to build an evidence base to inform guidance for review authors.
PMID: 27535658 [PubMed - in process]
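The ICER described in the abstract above is the ratio of incremental cost to incremental effect between two screening approaches. A minimal sketch in Python follows; the function names and all numeric values are hypothetical illustrations, not figures from the study.

```python
def icer(cost_new, cost_ref, effect_new, effect_ref):
    """Incremental cost per additional unit of effect versus a reference."""
    delta_effect = effect_new - effect_ref
    if delta_effect == 0:
        raise ValueError("equal effects: the ICER is undefined")
    return (cost_new - cost_ref) / delta_effect


def dominates(cost_a, effect_a, cost_b, effect_b):
    """True if A costs no more and is at least as effective as B,
    and is strictly better on at least one of the two."""
    no_worse = cost_a <= cost_b and effect_a >= effect_b
    strictly_better = cost_a < cost_b or effect_a > effect_b
    return no_worse and strictly_better


# Hypothetical values only: cost in pounds, effect in citations 'saved'.
reference_cost, reference_saved = 2000.0, 90    # a cheaper approach
comparator_cost, comparator_saved = 12000.0, 95  # a more labour-intensive one

print(icer(comparator_cost, reference_cost, comparator_saved, reference_saved))
```

An approach that is equally effective and less costly than another, as the abstract reports for 'safety first' versus 'double screening', is the case `dominates` is meant to capture.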
Corpus Domain Effects on Distributional Semantic Modeling of Medical Terms.
Bioinformatics. 2016 Aug 16;
Authors: Pakhomov SV, Finley G, McEwan R, Wang Y, Melton GB
Abstract
MOTIVATION: Automatically quantifying semantic similarity and relatedness between clinical terms is an important aspect of text mining from electronic health records, which are increasingly recognized as valuable sources of phenotypic information for clinical genomics and bioinformatics research. A key obstacle to development of semantic relatedness measures is the limited availability of large quantities of clinical text to researchers and developers outside of major medical centers. Text from general English and biomedical literature is freely available; however, its validity as a substitute for clinical-domain text in representing the semantics of clinical terms remains to be demonstrated.
RESULTS: We constructed neural network representations of clinical terms found in a publicly available benchmark dataset manually labeled for semantic similarity and relatedness. Similarity and relatedness measures computed from text corpora in three domains (Clinical Notes, PubMed Central articles, and Wikipedia) were compared using the benchmark as reference. We found that measures computed from full text of biomedical articles in PubMed Central repository (rho = 0.62 for similarity and 0.58 for relatedness) are on par with measures computed from clinical reports (rho = 0.60 for similarity and 0.57 for relatedness). We also evaluated the use of neural network based relatedness measures for query expansion in a clinical document retrieval task and a biomedical term word sense disambiguation task. We found that, with some limitations, biomedical articles may be used in lieu of clinical reports to represent the semantics of clinical terms and that distributional semantic methods are useful for clinical and biomedical natural language processing applications.
CONTACT: pakh0002@umn.edu, gpfinley@umn.edu, rmcewan@umn.edu, wang2258@umn.edu, gmelton@umn.edu.
PMID: 27531100 [PubMed - as supplied by publisher]
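The evaluation described in the abstract above compares model-derived similarity scores with human ratings using a rank correlation (rho). A minimal pure-Python sketch follows, assuming cosine similarity between term vectors and Spearman's rho with average ranks for ties; the abstract does not state these implementation details, and any vectors or ratings passed in are the caller's own.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))
```

In use, `x` would hold the cosine scores for each benchmark term pair and `y` the corresponding human ratings.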
PIPE: a protein-protein interaction passage extraction module for BioCreative challenge.
Database (Oxford). 2016;2016
Authors: Chang YC, Chu CH, Su YC, Chen CC, Hsu WL
Abstract
Identifying the interactions between proteins mentioned in the biomedical literature is one of the frequently discussed topics of text mining in the life science field. In this article, we propose PIPE, an interaction pattern generation module used in the Collaborative Biocurator Assistant Task at BioCreative V (http://www.biocreative.org/) to capture frequent protein-protein interaction (PPI) patterns within text. We also present an interaction pattern tree (IPT) kernel method that integrates the PPI patterns with a convolution tree kernel (CTK) to extract PPIs. Methods were evaluated on the LLL, IEPA, HPRD50, AIMed and BioInfer corpora using cross-validation, cross-learning and cross-corpus evaluation. Empirical evaluations demonstrate that our method is effective and outperforms several well-known PPI extraction methods. DATABASE URL.
PMID: 27524807 [PubMed - as supplied by publisher]
A Knowledge Map for Hospital Performance Concept: Extraction and Analysis: A Narrative Review Article.
Iran J Public Health. 2016 Jul;45(7):843-54
Authors: Markazi-Moghaddam N, Arab M, Ravaghi H, Rashidian A, Khatibi T, Zargar Balaye Jame S
Abstract
BACKGROUND: Performance is a multi-dimensional and dynamic concept. During the past two decades, a considerable number of studies have developed the concept of hospital performance. To identify the key concepts in the hospital performance literature, knowledge visualization based on co-word analysis and social network analysis was used.
METHODS: Documents were identified by searching PubMed from 1945 to 2014; 2350 papers entered the study after unrelated articles, duplicates, and articles without abstracts were omitted. After pre-processing, keywords were extracted and terms were weighted using the TF-IDF weighting scheme. Support, an interestingness measure that considers the co-occurrence of the extracted keywords and the phrase "hospital performance", was calculated. Keywords with high support with "hospital performance" were selected. The term-term matrix of these selected keywords was calculated and the graph was extracted.
RESULTS: The most frequent words after "Hospital Performance" were "mortality" and "efficiency". The major knowledge structure of the hospital performance literature during these years shows that the keyword "mortality" had the highest support with hospital performance, followed by "quality of care", "quality improvement", "discharge", "length of stay" and "clinical outcome". The strongest relationship is seen between "electronic medical record" and "readmission rate".
CONCLUSION: Some dimensions of hospital performance, such as "efficiency", "effectiveness", "quality" and "safety", are more important, and some indicators, such as "mortality", "length of stay", "readmission rate" and "patient satisfaction", are more prominent. In the last decade, some concepts, such as "mortality", "quality of care" and "quality improvement", became more significant in the hospital performance literature.
PMID: 27516990 [PubMed]
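The TF-IDF weighting and the support measure mentioned in the abstract above can be sketched as follows. This is an illustration under assumptions: the abstract does not specify the exact TF-IDF variant, so the sketch uses relative term frequency times log(N / document frequency), and it treats support as the fraction of documents in which two terms co-occur. The function names are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        term_freq = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return weights


def support(docs, term_a, term_b):
    """Fraction of documents in which both terms occur."""
    both = sum(1 for doc in docs if term_a in doc and term_b in doc)
    return both / len(docs)
```

Under this definition, a term that occurs in every document receives a TF-IDF weight of zero, which is why support against a fixed phrase, rather than TF-IDF alone, is useful for selecting keywords tied to that phrase.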
BioC viewer: a web-based tool for displaying and merging annotations in BioC.
Database (Oxford). 2016;2016
Authors: Shin SY, Kim S, Wilbur WJ, Kwon D
Abstract
BioC is an XML-based format designed to provide interoperability for text mining tools and manual curation results. A challenge of BioC as a standard format is to align annotations from multiple systems. Ideally, this should not be a major problem if users follow guidelines given by BioC key files. Nevertheless, the misalignment between text and annotations happens quite often because different systems tend to use different software development environments, e.g. ASCII vs. Unicode. We first implemented the BioC Viewer to assist BioGRID curators as a part of the BioCreative V BioC track (Collaborative Biocurator Assistant Task). For the BioC track, the BioC Viewer helped curate protein-protein interaction and genetic interaction pairs appearing in full-text articles. Here, we describe the BioC Viewer itself as well as improvements made to the BioC Viewer since the BioCreative V Workshop to address the misalignment issue of BioC annotations. While uploading BioC files, a BioC merge process is offered when there are files from the same full-text article. If there is a mismatch between an annotated offset and text, the BioC Viewer adjusts the offset to correctly align with the text. The BioC Viewer has a user-friendly interface, where most operations can be performed within a few mouse clicks. The feedback from BioGRID curators has been positive for the web interface, particularly for its usability and learnability. Database URL: http://viewer.bioqrator.org.
PMID: 27515823 [PubMed - in process]
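The offset adjustment described in the abstract above can be illustrated with a small sketch. This is not the BioC Viewer's actual algorithm, which the abstract does not detail; it is one plausible approach, assuming each annotation carries its own text and a claimed character offset. The function name `realign_offset` is hypothetical.

```python
def realign_offset(document, annotation_text, offset):
    """Return an offset at which annotation_text really occurs in document.

    If the claimed offset is already correct it is returned unchanged.
    Otherwise the occurrence nearest to the claimed offset is chosen.
    Returns None if the annotation text does not occur at all.
    """
    end = offset + len(annotation_text)
    if document[offset:end] == annotation_text:
        return offset
    best = None
    start = document.find(annotation_text)
    while start != -1:
        if best is None or abs(start - offset) < abs(best - offset):
            best = start
        start = document.find(annotation_text, start + 1)
    return best


# A tool counting UTF-8 bytes would place "TP53" one position too far,
# because the Greek letter occupies two bytes but one character.
text = "β-catenin binds TP53"
print(realign_offset(text, "TP53", 17))
```

Choosing the nearest occurrence is a design assumption: encoding mismatches usually shift offsets by a small amount, so the closest match is the most likely intended span.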
ARN: analysis and prediction by adipogenic professional database.
BMC Syst Biol. 2016;10(1):57
Authors: Huang Y, Wang L, Zan AL
Abstract
Adipogenesis is the process of cell differentiation by which mesenchymal stem cells become adipocytes. Extensive research is ongoing to identify genes, their protein products, and microRNAs that correlate with fat cell development. The existing databases have focused on certain types of regulatory factors and interactions. However, there is no relationship between the results of the experimental studies on adipogenesis and these databases because of the lack of an information center. This information fragmentation hampers the identification of key regulatory genes and pathways. Thus, it is necessary to provide an information center that is quickly and easily accessible to researchers in this field. We selected and integrated data from eight external databases based on the results of text-mining, and constructed a publicly available database and web interface (URL: http://210.27.80.93/arn/ ), which contained 30873 records related to adipogenic differentiation. Then, we designed an online analysis tool to analyze the experimental data or form a scientific hypothesis about adipogenesis through Swanson's literature-based discovery process. Furthermore, we calculated the "Impact Factor" ("IF") value that reflects the importance of each node by counting the numbers of relation records, expression records, and prediction records for each node. This platform can support ongoing adipogenesis research and contribute to the discovery of key regulatory genes and pathways.
PMID: 27503118 [PubMed - in process]
A Pilot Study of a Heuristic Algorithm for Novel Template Identification from VA Electronic Medical Record Text.
J Biomed Inform. 2016 Aug 3;
Authors: Redd AM, Gundlapalli AV, Divita G, Carter ME, Tran LT, Samore MH
Abstract
RATIONALE: Templates in text notes pose challenges for automated information extraction algorithms. We propose a method that identifies novel templates in plain text medical notes. The identification can then be used to either include or exclude templates when processing notes for information extraction.
METHODS: The two-module method is based on the framework of information foraging and addresses the hypothesis that documents containing templates and templates within those documents can be identified by common features. The first module is a grouping module that takes documents from the corpus and groups documents with common templates. This is accomplished through a binned word count hierarchical clustering algorithm. The second module performs extraction of the template. It uses the groupings and performs a longest common subsequence (LCS) algorithm to obtain the constituent parts of the templates. The method was developed and tested on a random document corpus of 750 notes derived from a large database of US Department of Veterans Affairs (VA) electronic medical notes.
RESULTS: For the grouping module by using hierarchical clustering we identified 23 groups with 3 documents or more, consisting of 120 documents in total from the 750 documents in our test corpus. Of these, 18 groups had at least one common template that was present in all documents in the group for a positive predictive value of 78%. The LCS extraction module performed with 100% positive predictive value, 94% sensitivity, and 83% negative predictive value. The human review determined that in 4 groups the template covered the entire document, with the remaining 14 groups containing a common section template. Among documents with templates, the number of templates per document ranged from 1 to 14. The mean and median number of templates per group was 5.9 and 5, respectively.
DISCUSSION: The grouping method was successful in finding like documents that contained templates. Of the groups of documents containing templates, the LCS module was successful in deciphering what belonged to the template and what was extraneous. Major obstacles to improved performance included documents that were composed of multiple templates, templates that included other templates embedded within them, and variants of templates. We demonstrate proof of concept of the grouping and extraction method of identifying templates in electronic medical records in this pilot study and propose methods to improve performance and scaling up.
PMID: 27497780 [PubMed - as supplied by publisher]
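The extraction module in the abstract above relies on a longest common subsequence (LCS) over the notes in a group. A minimal sketch follows, assuming whitespace tokenization and a pairwise fold across the group; both are illustrative simplifications rather than the authors' implementation, and the fold is a heuristic, not an exact multi-sequence LCS.

```python
def longest_common_subsequence(a, b):
    """Classic dynamic-programming LCS over two token sequences."""
    rows, cols = len(a), len(b)
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            if a[i] == b[j]:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])
    # Walk back through the table to recover one LCS.
    result = []
    i, j = rows, cols
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            result.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return result[::-1]


def common_template(notes):
    """Fold the pairwise LCS across all notes in a group."""
    template = notes[0].split()
    for note in notes[1:]:
        template = longest_common_subsequence(template, note.split())
    return template


# Hypothetical notes sharing a section template.
group = [
    "Chief complaint: cough Allergies: none",
    "Chief complaint: fever and chills Allergies: penicillin",
]
print(common_template(group))
```

The tokens that survive the fold are candidates for the template; the tokens dropped along the way are the note-specific content.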
ModelView for ModelDB: Online Presentation of Model Structure.
Neuroinformatics. 2015 Oct;13(4):459-70
Authors: McDougal RA, Morse TM, Hines ML, Shepherd GM
Abstract
ModelDB ( modeldb.yale.edu ), a searchable repository of source code of more than 950 published computational neuroscience models, seeks to promote model reuse and reproducibility. Code sharing is a first step; however, model source code is often large and not easily understood. To aid users, we have developed ModelView, a web application for ModelDB that presents a graphical view of model structure augmented with contextual information for NEURON and NEURON-runnable (e.g. NeuroML, PyNN) models. Web presentation provides a rich, simulator-independent environment for interacting with graphs. The necessary data is generated by combining manual curation, text-mining the source code, querying ModelDB, and simulator introspection. Key features of the user interface along with the data analysis, storage, and visualization algorithms are explained. With this tool, researchers can examine and assess the structure of hundreds of models in ModelDB in a standardized presentation without installing any software, downloading the model, or reading model source code.
PMID: 25896640 [PubMed - indexed for MEDLINE]
Digitisation, Big Data, and the Future of the Medical Humanities: Text-Mining and the History of Medicine: Big Data, Big Questions?
Med Hist. 2016 Apr;60(2):294-6
Authors: Toon E, Timmermann C, Worboys M
PMID: 26971613 [PubMed - indexed for MEDLINE]
Literature-Informed Analysis of a Genome-Wide Association Study of Gestational Age in Norwegian Women and Children Suggests Involvement of Inflammatory Pathways.
PLoS One. 2016;11(8):e0160335
Authors: Bacelis J, Juodakis J, Sengpiel V, Zhang G, Myhre R, Muglia LJ, Nilsson S, Jacobsson B
Abstract
BACKGROUND: Five-to-eighteen percent of pregnancies worldwide end in preterm birth, which is the major cause of neonatal death and morbidity. Approximately 30% of the variation in gestational age at birth can be attributed to genetic factors. Genome-wide association studies (GWAS) have not shown robust evidence of association with genomic loci yet.
METHODS: We separately investigated 1921 Norwegian mothers and 1199 children from pregnancies with spontaneous onset of delivery. Individuals were further divided based on the onset of delivery: initiated by labor or prelabor rupture of membranes. Genetic association with ultrasound-dated gestational age was evaluated using three genetic models and adaptive permutations. The top-ranked loci were tested for enrichment in 12 candidate gene-sets generated by text-mining PubMed abstracts containing pregnancy-related keywords.
RESULTS: The six GWAS did not reveal significant associations, with the most extreme empirical p = 5.1 × 10^-7. The top loci from maternal GWAS with deliveries initiated by labor showed significant enrichment in 10 PubMed gene-sets, e.g., p = 0.001 and 0.005 for keywords "uterus" and "preterm" respectively. Enrichment signals were mainly caused by infection/inflammation-related genes TLR4, NFKB1, ABCA1, MMP9. Literature-informed analysis of top loci revealed further immunity genes: IL1A, IL1B, CAMP, TREM1, TFRC, NFKBIA, MEFV, IRF8, WNT5A.
CONCLUSION: Our analyses support the role of inflammatory pathways in determining pregnancy duration and provide a list of 32 candidate genes for a follow-up work. We observed that the top regions from GWAS in mothers with labor-initiated deliveries significantly more often overlap with pregnancy-related genes than would be expected by chance, suggesting that increased sample size would benefit similar studies.
PMID: 27490719 [PubMed - as supplied by publisher]
Erratum to: A novel procedure on next generation sequencing data analysis using text mining algorithm.
BMC Bioinformatics. 2016;17(1):301
Authors: Zhao W, Chen JJ, Perkins R, Wang Y, Liu Z, Hong H, Tong W, Zou W
PMID: 27489012 [PubMed - as supplied by publisher]
Systems Pharmacology Dissection of the Protective Effect of Myricetin Against Acute Ischemia/Reperfusion-Induced Myocardial Injury in Isolated Rat Heart.
Cardiovasc Toxicol. 2016 Aug 2;
Authors: Qiu Y, Cong N, Liang M, Wang Y, Wang J
Abstract
In this paper, we investigated the multi-target effect of myricetin as a therapeutic for cardiovascular disease, using an acute ischemia/reperfusion-induced myocardial injury model to gain insight into its mechanism of action. The compound-target interaction profiles of myricetin were determined using a combination of text mining, chemometric and chemogenomic methods. The effect of myricetin on cardiac function was investigated by carrying out experiments in rats subjected to ischemia/reperfusion (I/R) using Langendorff retrograde perfusion technology. Compared to the I/R group, pretreatment with 5 μM myricetin was observed to improve the maximum up/down rate of left ventricular pressure (dp/dt max) and coronary flow, raise left ventricular developed pressure, and decrease creatine kinase and lactate dehydrogenase levels in coronary flow. In addition, myricetin treatment was shown to have beneficial effects through its ability to reduce both infarct size and levels of cardiomyocyte apoptosis. Myricetin was also observed to have antioxidant properties, as evidenced by its ability to reduce MDA levels, while increasing both SOD levels and the GSH/GSSG ratio. Finally, an upregulation of 6-phosphogluconate dehydrogenase and fatty acid synthase expression and a downregulation of cyclooxygenase-2, cytochrome P450 and p38 mitogen-activated protein kinase expression suggest that myricetin acts through mechanisms which alter relevant signaling pathways. In summary, our results demonstrate that myricetin has protective cardiovascular effects against I/R-induced myocardial injury.
PMID: 27484498 [PubMed - as supplied by publisher]
iLIR database: a web resource for LIR motif-containing proteins in eukaryotes.
Autophagy. 2016 Aug 2;:0
Authors: Jacomin AC, Samavedam S, Promponas V, Nezis IP
Abstract
Atg8-family proteins are the best-studied proteins of the core autophagic machinery. They are essential for the elongation and closure of the phagophore into a proper autophagosome. Moreover, Atg8-family proteins are associated with the phagophore from the initiation of the autophagic process to, or just prior to, the fusion of autophagosomes with lysosomes. In addition to their implication in autophagosome biogenesis, they are crucial for selective autophagy through their ability to interact with selective autophagy receptor proteins necessary for the specific targeting of substrates for autophagic degradation. In the last few years it has been revealed that Atg8-interacting proteins include not only receptors but also components of the core autophagic machinery, proteins associated with vesicles and their transport, and specific proteins that are selectively degraded by autophagy. Atg8-interacting proteins contain a short linear motif, the LC3-interacting region/LC3 recognition sequence/Atg8-interacting motif (LIR/LRS/AIM), which is responsible for their interaction with Atg8-family proteins. These proteins are referred to as LIR-containing proteins (LIRCPs). So far, many experimental efforts have been carried out to identify new LIRCPs, leading to the characterization of some of them in the past 10 years. Given the need for the identification of LIRCPs in various organisms, we developed the iLIR database ( https://ilir.warwick.ac.uk ) as a freely available web resource, listing all the putative canonical LIRCPs identified in silico in the proteomes of 8 model organisms using the iLIR server, combined with a Gene Ontology (GO) term analysis. Additionally, a curated text-mining analysis of the literature permitted us to identify novel putative LIRCPs in mammals that have not previously been associated with autophagy.
PMID: 27484196 [PubMed - as supplied by publisher]
METSP: a maximum-entropy classifier based text mining tool for transporter-substrate identification with semistructured text.
Biomed Res Int. 2015;2015:254838
Authors: Zhao M, Chen Y, Qu D, Qu H
Abstract
The substrates of a transporter are not only useful for inferring the function of the transporter, but also important for discovering compound-compound interactions and reconstructing metabolic pathways. Though plenty of data has been accumulated with the development of new technologies such as in vitro transporter assays, the search for substrates of transporters is far from complete. In this article, we introduce METSP, a maximum-entropy classifier designed to retrieve transporter-substrate pairs (TSPs) from semistructured text. Based on the high-quality annotation from UniProt, METSP achieves high precision and recall in cross-validation experiments. When METSP is applied to 182,829 human transporter annotation sentences in UniProt, it identifies 3942 sentences with transporter and compound information. Finally, 1547 high-confidence human TSPs are identified for further manual curation, of which 58.37% involve novel substrates not annotated in public transporter databases. METSP is the first efficient tool to extract TSPs from semistructured annotation text in UniProt. This tool can help to determine the precise substrates and drugs of transporters, thus facilitating drug-target prediction, metabolic network reconstruction, and literature classification.
PMID: 26495291 [PubMed - indexed for MEDLINE]
Diagnostic utility of droplet digital PCR for HIV reservoir quantification.
J Virus Erad. 2016;2(3):162-9
Authors: Trypsteen W, Kiselinova M, Vandekerckhove L, De Spiegelaere W
Abstract
Quantitative real-time PCR (qPCR) is implemented in many molecular laboratories worldwide for the quantification of viral nucleic acids. However, over the last two decades, there has been renewed interest in the concept of digital PCR (dPCR) as this platform offers direct quantification without the need for standard curves, a simplified workflow and the possibility to extend the current detection limit. These benefits are of great interest in terms of the quantification of low viral levels in HIV reservoir research because changes in the dynamics of residual HIV reservoirs will be important to monitor HIV cure efforts. Here, we have implemented a systematic literature screening and text mining approach to map the use of droplet dPCR (ddPCR) in the context of HIV quantification. In addition, several technical aspects of ddPCR were compared with qPCR: accuracy, sensitivity, precision and reproducibility, to determine its diagnostic utility. We have observed that ddPCR was used in different body compartments in multiple HIV-1 and HIV-2 assays, with the majority of reported assays focusing on HIV-1 DNA-based applications (i.e. total HIV DNA). Furthermore, ddPCR showed a higher accuracy, precision and reproducibility, but similar sensitivity when compared to qPCR due to reported false positive droplets in the negative template controls with a need for standardised data analysis (i.e. threshold determination). In the context of a low level of detection and HIV reservoir diagnostics, ddPCR can offer a valid alternative to qPCR-based assays but before this platform can be clinically accredited, some remaining issues need to be resolved.
PMID: 27482456 [PubMed]
Gender differences in cancer susceptibility: role of oxidative stress.
Carcinogenesis. 2016 Jul 31;
Authors: Ali I, Högberg J, Hsieh JH, Auerbach S, Korhonen A, Stenius U, Silins I
Abstract
Cancer is a leading cause of death worldwide and environmental factors, including chemicals, have been suggested as major etiological incitements. Cancer statistics indicate that men get more cancer than women. However, differences in the known risk factors, including lifestyle and occupational exposure, offer only a partial explanation. Using a text mining tool, we have investigated the scientific literature concerning male- and female-specific rat carcinogens that induced tumors in only one gender in NTP 2-year cancer bioassays. Our evaluation shows that oxidative stress, although frequently reported for both male- and female-specific rat carcinogens, was mentioned significantly more often in the literature concerning male-specific rat carcinogens. Literature analysis of testosterone and estradiol showed the same pattern. Tox21 high-throughput assay results, although showing only a weak association of oxidative stress-related processes for male- and female-specific rat carcinogens, provide additional support. We also analyzed the literature concerning 26 established human carcinogens (IARC group 1). Oxidative stress was more frequently reported for the majority of these carcinogens, and the Tox21 data resembled that of male-specific rat carcinogens. Thus, our data, based on about 600,000 scientific abstracts and Tox21 screening assays, suggest a link between male-specific carcinogens, testosterone and oxidative stress. This implies that a different cellular response to oxidative stress in men and women may be a critical factor in explaining the greater cancer susceptibility observed in men. Although the IARC carcinogens are classified as human carcinogens, their classification is largely based on epidemiological evidence from male cohorts, which raises the question of whether carcinogen classifications should be gender specific.
PMID: 27481070 [PubMed - as supplied by publisher]
Gene-Disease Interaction Retrieval from Multiple Sources: A Network Based Method.
Biomed Res Int. 2016;2016:3594517
Authors: Huang L, Wang Y, Wang Y, Bai T
Abstract
The number of gene-related databases has grown substantially along with bioinformatics research on genes. These databases contain gene functions, pathways, interactions, and so forth, while much biomedical knowledge about human diseases is stored as text in the literature. Researchers have developed many methods to extract structured biomedical knowledge. Some study and improve text mining algorithms for efficiency, in order to cover as many data sources as possible, while others build open-source databases that accept individual submissions, in order to achieve accuracy. This paper combines both efforts with biomedical ontologies to build an interaction network of multiple biomedical ontologies, which guarantees its robustness as well as its wide coverage of biomedical publications. On this network, we implement an algorithm that discovers paths between concept pairs and shows potential relations.
PMID: 27478829 [PubMed - in process]
Decision Support Environment for Medical Product Safety Surveillance.
J Biomed Inform. 2016 Jul 28;
Authors: Botsis T, Jankosky C, Arya D, Kreimeyer K, Foster M, Pandey A, Wang W, Zhang G, Forshee R, Goud R, Menschik D, Walderhaug M, Woo EJ, Scott J
Abstract
We have developed a Decision Support Environment (DSE) for medical experts at the US Food and Drug Administration (FDA). The DSE contains two integrated systems: The Event-based Text-mining of Health Electronic Records (ETHER) and the Pattern-based and Advanced Network Analyzer for Clinical Evaluation and Assessment (PANACEA). These systems assist medical experts in reviewing reports submitted to the Vaccine Adverse Event Reporting System (VAERS) and the FDA Adverse Event Reporting System (FAERS). In this manuscript, we describe the DSE architecture and key functionalities, and examine its potential contributions to the signal management process by focusing on four use cases: the identification of missing cases from a case series, the identification of duplicate case reports, retrieving cases for a case series analysis, and community detection for signal identification and characterization.
PMID: 27477839 [PubMed - as supplied by publisher]
IPAT: a freely accessible software tool for analyzing multiple patent documents with inbuilt landscape visualizer.
Pharm Pat Anal. 2015;4(5):377-86
Authors: Ajay D, Gangwal RP, Sangamwar AT
Abstract
Intelligent Patent Analysis Tool (IPAT) is an online data retrieval tool, based on a text mining algorithm, that extracts specific patent information in a predetermined pattern into an Excel sheet. The software is designed and developed to retrieve and analyze technology information from multiple patent documents and generate various patent landscape graphs and charts. The software, coded in C# in Visual Studio 2010, extracts publicly available patent information from web pages such as Google Patent and simultaneously analyzes technology trends based on user-defined parameters. In other words, IPAT combined with manual categorization will act as an excellent technology assessment tool in competitive intelligence and due diligence for forecasting future R&D.
PMID: 26452016 [PubMed - indexed for MEDLINE]
Drug Drug Interaction Extraction from Biomedical Literature Using Syntax Convolutional Neural Network.
Bioinformatics. 2016 Jul 27;
Authors: Zhao Z, Yang Z, Luo L, Lin H, Wang J
Abstract
MOTIVATION: Detecting drug-drug interaction (DDI) has become a vital part of public health safety. Therefore, using text mining techniques to extract DDIs from biomedical literature has received considerable attention. However, this research is still at an early stage and its performance has much room to improve.
RESULTS: In this paper, we present a syntax convolutional neural network (SCNN) based DDI extraction method. In this method, a novel word embedding, syntax word embedding, is proposed to employ the syntactic information of a sentence. Then the position and part of speech (POS) features are introduced to extend the embedding of each word. Next, an auto-encoder is introduced to encode the traditional bag-of-words feature (a sparse 0-1 vector) as a dense real-valued vector. Finally, a combination of embedding-based convolutional features and traditional features is fed to the softmax classifier to extract DDIs from biomedical literature. Experimental results on the DDIExtraction 2013 corpus show that SCNN obtains a better performance (an F-score of 0.686) than other state-of-the-art methods.
AVAILABILITY: The source code is available for academic use at http://202.118.75.18:8080/DDI/SCNN-DDI.zip
CONTACT: yangzh@dlut.edu.cn
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
PMID: 27466626 [PubMed - as supplied by publisher]
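Three building blocks named in the abstract above are simple enough to sketch: the sparse 0-1 bag-of-words vector, the softmax over class scores, and the F-score used for evaluation. The snippet below is an illustration only and does not reproduce the SCNN model; the vocabulary, tokens, and counts are hypothetical, and the F-score shown is the standard balanced F1.

```python
import math


def bag_of_words(tokens, vocabulary):
    """Sparse 0-1 vector: 1 where the vocabulary word occurs in the sentence."""
    present = set(tokens)
    return [1 if word in present else 0 for word in vocabulary]


def softmax(scores):
    """Convert raw class scores to probabilities (shifted for stability)."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def f_score(true_pos, false_pos, false_neg):
    """Balanced F-score: harmonic mean of precision and recall."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical vocabulary and sentence tokens.
vocabulary = ["aspirin", "warfarin", "decreases"]
print(bag_of_words(["aspirin", "increases", "warfarin"], vocabulary))
```

In the described method the 0-1 vector is first compressed by an auto-encoder before being combined with the convolutional features; that step is omitted here.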