Semantic Web
Integration and Querying of Genomic and Proteomic Semantic Annotations for Biomedical Knowledge Extraction.
IEEE/ACM Trans Comput Biol Bioinform. 2016 Mar-Apr;13(2):209-19
Authors: Masseroli M, Canakoglu A, Ceri S
Abstract
Understanding complex biological phenomena involves answering complex biomedical questions on multiple types of biomolecular information simultaneously; this information is expressed through multiple genomic and proteomic semantic annotations scattered across many distributed and heterogeneous data sources. Such heterogeneity and dispersion hamper biologists' ability to ask global queries and perform global evaluations. To overcome this problem, we developed a software architecture to create and maintain a Genomic and Proteomic Knowledge Base (GPKB), which integrates several of the most relevant sources of such dispersed information (including Entrez Gene, UniProt, IntAct, Expasy Enzyme, GO, GOA, BioCyc, KEGG, Reactome, and OMIM). Our solution is general, as it uses a flexible, modular, and multilevel global data schema based on abstraction and generalization of integrated data features, and a set of automatic procedures that ease data integration and maintenance, even when the integrated data sources evolve in data content, structure, and number. These procedures also ensure consistency, quality, and provenance tracking of all integrated data, and perform the semantic closure of the hierarchical relationships of the integrated biomedical ontologies. At http://www.bioinformatics.deib.polimi.it/GPKB/, a Web interface allows easy graphical composition of queries, even complex ones, on the knowledge base; it also supports semantic query expansion and comprehensive explorative search of the integrated data to better sustain biomedical knowledge extraction.
PMID: 27045824 [PubMed - indexed for MEDLINE]
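The "semantic closure of the hierarchical relationships" mentioned in the GPKB abstract above can be read as computing, for every ontology term, all of its direct and indirect ancestors. The following is a minimal sketch of that idea only, not the authors' procedure; the function name `ancestor_closure` and the term labels are illustrative assumptions.

```python
from collections import defaultdict


def ancestor_closure(is_a_edges):
    """Map each term to the set of all its direct and indirect ancestors.

    is_a_edges: iterable of (child, parent) pairs describing a hierarchy.
    """
    parents = defaultdict(set)
    for child, parent in is_a_edges:
        parents[child].add(parent)

    # Every term that appears as a child or as a parent.
    terms = set(parents) | {p for ps in parents.values() for p in ps}

    closure = {}
    for term in terms:
        seen = set()
        stack = list(parents.get(term, ()))
        while stack:
            current = stack.pop()
            if current in seen:
                continue  # already expanded; also keeps cycles from looping
            seen.add(current)
            stack.extend(parents.get(current, ()))
        closure[term] = seen
    return closure


# Illustrative labels only (hypothetical, simplified hierarchy).
edges = [
    ("kinase activity", "transferase activity"),
    ("transferase activity", "catalytic activity"),
    ("catalytic activity", "molecular function"),
]
closure = ancestor_closure(edges)
```

With the closure stored, a query for annotations to "catalytic activity" can also return items annotated to any descendant term, which is one plausible basis for the semantic query expansion the abstract describes.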
A knowledgebase of the human Alu repetitive elements.
J Biomed Inform. 2016 Apr;60:77-83
Authors: Mallona I, Jordà M, Peinado MA
Abstract
Alu elements are the most abundant retrotransposons in the human genome with more than one million copies. Alu repeats have been reported to participate in multiple processes related to genome regulation and compartmentalization. Moreover, they have been implicated in the facilitation of pathological mutations in many diseases, including cancer. The contribution of Alus and other repeats to genomic regulation is often overlooked because their study poses technical and analytical challenges that are hard to address with conventional strategies. Here we propose the integration of ontology-based semantic methods to query a knowledgebase for the human Alus. The knowledgebase leverages the Sequence Ontology (SO) and Gene Ontology (GO) and is devoted to addressing functional and genetic information in the genomic context of the Alus. For each Alu element, the closest gene and transcript are stored, as well as their functional annotation according to GO, the state of the chromatin and the transcription factor binding sites inside the Alu. The model uses Web Ontology Language (OWL) and Semantic Web Rule Language (SWRL). As a use case, and to illustrate the utility of the tool, we have evaluated the epigenetic states of Alu repeats associated with gene promoters according to their transcriptional activity. The ontology is easily extendable, offering a scaffold for the inclusion of new experimental data. The RDF/XML formalization is freely available at http://aluontology.sourceforge.net/.
PMID: 26827622 [PubMed - indexed for MEDLINE]
Design and Development of a Sharable Clinical Decision Support System Based on a Semantic Web Service Framework.
J Med Syst. 2016 May;40(5):118
Authors: Zhang YF, Gou L, Tian Y, Li TC, Zhang M, Li JS
Abstract
Clinical decision support (CDS) systems provide clinicians and other health care stakeholders with patient-specific assessments or recommendations to aid in the clinical decision-making process. Despite their demonstrated potential for improving health care quality, the widespread availability of CDS systems has been limited mainly by the difficulty and cost of sharing CDS knowledge among heterogeneous healthcare information systems. The purpose of this study was to design and develop a sharable clinical decision support (S-CDS) system that meets this challenge. The fundamental knowledge base consists of independent and reusable knowledge modules (KMs) to meet core CDS needs, wherein each KM is semantically well defined based on the standard information model, terminologies, and representation formalisms. A semantic web service framework was developed to identify, access, and leverage these KMs across diverse CDS applications and care settings. The S-CDS system has been validated in two distinct client CDS applications. Model-level evaluation results confirmed coherent knowledge representation. Application-level evaluation results reached an overall accuracy of 98.66 % and a completeness of 96.98 %. The evaluation results demonstrated the technical feasibility and application prospect of our approach. Compared with other CDS engineering efforts, our approach facilitates system development and implementation and improves system maintainability, scalability and efficiency, which contribute to the widespread adoption of effective CDS within the healthcare domain.
PMID: 27002818 [PubMed - indexed for MEDLINE]
Easy Extraction of Terms and Definitions with OWL2TL.
CEUR Workshop Proc. 2016 Aug;1747:
Authors: Judkins J, Utecht J, Brochhausen M
Abstract
Facilitating good communication between semantic web specialists and domain experts is necessary for efficient ontology development. This development may be hindered by the fact that domain experts tend to be unfamiliar with tools used to create and edit OWL files. This is true in particular when changes to definitions need to be reviewed as often as multiple times a day. We developed "OWL to Term List" (OWL2TL) with the goal of allowing domain experts to view the terms and definitions of an OWL file organized in a list that is updated each time the OWL file is updated. The tool is available online and currently generates a list of terms, along with additional annotation properties that are chosen by the user, in a format that allows easy copying into a spreadsheet.
PMID: 28035214 [PubMed]
Temporal data representation, normalization, extraction, and reasoning: A review from clinical domain.
Comput Methods Programs Biomed. 2016 May;128:52-68
Authors: Madkour M, Benhaddou D, Tao C
Abstract
BACKGROUND AND OBJECTIVE: We live our lives by the calendar and the clock, but time is also an abstraction, even an illusion. The sense of time can be both domain-specific and complex, and is often left implicit, requiring significant domain knowledge to accurately recognize and harness. In the clinical domain, the momentum gained from recent advances in infrastructure and governance practices has enabled the collection of a tremendous amount of data at each moment in time. Electronic health records (EHRs) have paved the way to making these data available for practitioners and researchers. However, temporal data representation, normalization, extraction and reasoning are essential for mining such massive data and therefore for constructing the clinical timeline. The objective of this work is to provide an overview of the problem of constructing a timeline at the clinical point of care and to summarize the state of the art in processing temporal information of clinical narratives.
METHODS: This review surveys the methods used in three important areas: modeling and representing time, medical NLP methods for extracting time, and methods of time reasoning and processing. The review emphasizes the existing gap between present methods and semantic web technologies and examines the possible combinations.
RESULTS: The main findings of this review reveal the importance of time processing not only in constructing timelines and clinical decision support systems but also as a vital component of EHR data models and operations.
CONCLUSIONS: Extracting temporal information in clinical narratives is a challenging task. The inclusion of ontologies and semantic web technologies will lead to better assessment of the annotation task and, together with medical NLP techniques, will help resolve granularity and co-reference problems.
PMID: 27040831 [PubMed - indexed for MEDLINE]
Drug-drug interaction discovery and demystification using Semantic Web technologies.
J Am Med Inform Assoc. 2016 Dec 28;:
Authors: Noor A, Assiri A, Ayvaz S, Clark C, Dumontier M
Abstract
OBJECTIVE: To develop a novel pharmacovigilance inferential framework to infer mechanistic explanations for asserted drug-drug interactions (DDIs) and deduce potential DDIs.
MATERIALS AND METHODS: A mechanism-based DDI knowledge base was constructed by integrating knowledge from several existing sources at the pharmacokinetic, pharmacodynamic, pharmacogenetic, and multipathway interaction levels. A query-based framework was then created to utilize this integrated knowledge base in conjunction with 9 inference rules to infer mechanistic explanations for asserted DDIs and deduce potential DDIs.
RESULTS: The drug-drug interactions discovery and demystification (D3) system achieved an overall 85% recall rate in terms of inferring mechanistic explanations for the DDIs integrated into its knowledge base, while demonstrating a 61% precision rate in terms of the inference or lack of inference of mechanistic explanations for a balanced, randomly selected collection of interacting and noninteracting drug pairs.
DISCUSSION: The successful demonstration of the D3 system's ability to confirm interactions involving well-studied drugs enhances confidence in its ability to deduce interactions involving less-studied drugs. In its demonstration, the D3 system infers putative explanations for most of its integrated DDIs. Further enhancements to this work in the future might include ranking interaction mechanisms based on likelihood of applicability, determining the likelihood of deduced DDIs, and making the framework publicly available.
CONCLUSION: The D3 system provides an early-warning framework for augmenting knowledge of known DDIs and deducing unknown DDIs. It shows promise in suggesting interaction pathways of research and evaluation interest and aiding clinicians in evaluating and adjusting courses of drug therapy.
PMID: 28031284 [PubMed - as supplied by publisher]
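The recall and precision rates reported for the D3 system above follow the usual information-retrieval definitions. The sketch below shows those standard formulas only; the counts are made up for illustration and are not taken from the paper, and the paper's exact evaluation protocol may differ from this simple form.

```python
def precision(true_positives, false_positives):
    """Fraction of predicted positives that are correct: TP / (TP + FP)."""
    total = true_positives + false_positives
    return true_positives / total if total else 0.0


def recall(true_positives, false_negatives):
    """Fraction of actual positives that are found: TP / (TP + FN)."""
    total = true_positives + false_negatives
    return true_positives / total if total else 0.0


# Hypothetical counts, chosen only to exercise the formulas.
tp, fp, fn = 8, 2, 4
p = precision(tp, fp)
r = recall(tp, fn)
```

Both functions return 0.0 when the denominator is zero, which is a design choice of this sketch rather than a convention stated in the abstract.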
Semantic Dementia: A Mini-Review.
Mini Rev Med Chem. 2016 Dec 23;
Authors: Klimova B, Novotny M, Kuca K
Abstract
BACKGROUND: At present there are about 47.5 million people suffering from different types of dementia, and by 2030 this number should reach 75.6 million. This brings about a serious social and economic burden for people suffering from any kind of dementia.
OBJECTIVE: The purpose of this article is to explore only semantic dementia (SD) as one of the forms of frontotemporal dementia (FTD) and to provide the latest information on its diagnosis and treatment, which play a significant role in maintaining the quality of life of both patients and their caregivers. Unimpaired communication in particular is one of the key factors in the relationship between the patients and their caregivers.
METHODS: The methods used for this mini review include a literature review of available sources found in recognized databases such as Web of Science, PubMed, Springer and Scopus from 2010 up to the present time, and a comparison and evaluation of the selected studies.
RESULTS: The findings of this mini review show that FTD, and SD in particular, is a serious neurodegenerative disorder which has fatal consequences for the affected patients. In addition, the findings also indicate that there are not many possibilities of pharmacological treatment for semantic dementia and therefore more attention should be paid to alternative, non-pharmacological approaches.
CONCLUSION: Although semantic dementia is a relatively rare neurodegenerative disorder compared with other types of dementia, it has an irreversible impact on the quality of life of patients and their caregivers.
PMID: 28019640 [PubMed - as supplied by publisher]
Computational modeling of brain pathologies: the case of multiple sclerosis.
Brief Bioinform. 2016 Dec 22;:
Authors: Pappalardo F, Rajput AM, Motta S
Abstract
The central nervous system is the most complex network of the human body. The existence and functionality of a large number of molecular species in the human brain are still ambiguous and mostly unknown, thus posing a challenge to Science and Medicine. Neurological diseases inherit the same level of complexity, making effective treatments difficult to find. Multiple sclerosis (MS) is a major neurological disease that causes severe disabilities and also a significant social burden on the health care system: between 2 and 2.5 million people are affected by it, and the cost associated with it is significantly higher than for other neurological diseases because of the chronic nature of the disease and the partial efficacy of current therapies. Despite difficulties in understanding and treating MS, many computational models have been developed to help neurologists. In the present work, we briefly review the main characteristics of MS and present selection criteria for modeling approaches.
PMID: 28011755 [PubMed - as supplied by publisher]
A Computational Chemistry Data Management Platform Based on the Semantic Web.
J Phys Chem A. 2016 Dec 12;
Authors: Wang B, Dobosh PA, Chalk SJ, Sopek M, Ostlund NS
Abstract
This paper presents a formal data publishing platform for computational chemistry using semantic web technologies. This platform encapsulates computational chemistry data from a variety of packages in an Extensible Markup Language (XML) file called CSX (Common Standard for eXchange). Based on a Gainesville Core (GC) ontology for computational chemistry, the CSX XML file is converted into the JavaScript Object Notation for Linked Data (JSON-LD) format using an XML Stylesheet Language Transformation (XSLT) file. Ultimately the JSON-LD file is converted to subject-predicate-object triples in a Turtle (TTL) file and published on the web portal. By leveraging semantic web technologies, we are able to place computational chemistry data onto web portals as a component of a Giant Global Graph (GGG) such that computer agents, as well as individual chemists, can access the data.
PMID: 27936706 [PubMed - as supplied by publisher]
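The last step of the pipeline described in the abstract above, turning JSON-LD into subject-predicate-object triples, can be sketched for the simplest case. This is not the platform's converter and not a full JSON-LD processor: it assumes a single flat node whose `@context` maps terms directly to IRI strings, and every IRI below is a hypothetical `example.org` placeholder rather than a Gainesville Core term. The output uses N-Triples syntax, which is also valid Turtle.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def jsonld_node_to_triples(doc):
    """Convert one flat JSON-LD node into (subject, predicate, kind, object) tuples.

    Assumes '@context' is a plain term -> IRI mapping (a simplification).
    """
    context = doc.get("@context", {})
    subject = doc["@id"]
    triples = []
    for key, value in doc.items():
        if key in ("@context", "@id"):
            continue
        values = value if isinstance(value, list) else [value]
        if key == "@type":
            # Types are IRIs, expanded through the context when possible.
            for v in values:
                triples.append((subject, RDF_TYPE, "iri", context.get(v, v)))
            continue
        predicate = context.get(key, key)
        for v in values:
            if isinstance(v, dict) and "@id" in v:
                triples.append((subject, predicate, "iri", v["@id"]))
            else:
                triples.append((subject, predicate, "literal", str(v)))
    return triples


def to_ntriples(triples):
    """Serialize the tuples as N-Triples lines (a subset of Turtle)."""
    lines = []
    for s, p, kind, o in triples:
        if kind == "iri":
            obj = f"<{o}>"
        else:
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


# Hypothetical document; the vocabulary IRIs are placeholders.
example = {
    "@context": {
        "Calculation": "http://example.org/gc#Calculation",
        "basisSet": "http://example.org/gc#basisSet",
    },
    "@id": "http://example.org/calc/1",
    "@type": "Calculation",
    "basisSet": "6-31G",
}
turtle_text = to_ntriples(jsonld_node_to_triples(example))
```

A production pipeline would normally rely on a conforming JSON-LD library to handle nested nodes, typed literals, and richer contexts.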
LifeWatch Greece data-services: Discovering Biodiversity Data using Semantic Web Technologies.
Biodivers Data J. 2016;(4):e8443
Authors: Minadakis N, Marketakis Y, Doerr M, Bekiari C, Papadakos P, Gougousis A, Bailly N, Arvanitidis C
Abstract
BACKGROUND: Biodiversity data is characterized by its cross-disciplinary character, the extremely broad range of data types and structures, and the variety of semantic concepts that it encompasses. Furthermore, there is a plethora of different data sources providing resources for the same piece of information in a heterogeneous way. Even if we restrict our attention to the Greek biodiversity domain, it is easy to see that biodiversity data remains unconnected and widely distributed among different sources.
NEW INFORMATION: To cope with these issues, in the context of the LifeWatch Greece project, i) we supported cataloguing and publishing of all the relevant metadata information of the Greek biodiversity domain, ii) we integrated data from heterogeneous sources by supporting the definitions of appropriate models, iii) we provided means for efficiently discovering biodiversity data of interest and iv) we enabled the answering of complex queries that could not be answered from the individual sources. This work has been exploited, evaluated and scientifically confirmed by the biodiversity community through the services provided by the LifeWatch Greece portal.
PMID: 27932908 [PubMed]
Publication of nuclear magnetic resonance experimental data with semantic web technology and the application thereof to biomedical research of proteins.
J Biomed Semantics. 2016 May 05;7(1):16
Authors: Yokochi M, Kobayashi N, Ulrich EL, Kinjo AR, Iwata T, Ioannidis YE, Livny M, Markley JL, Nakamura H, Kojima C, Fujiwara T
Abstract
BACKGROUND: The nuclear magnetic resonance (NMR) spectroscopic data for biological macromolecules archived at the BioMagResBank (BMRB) provide a rich resource of biophysical information at atomic resolution. The NMR data archived in NMR-STAR ASCII format have been implemented in a relational database. However, it is still fairly difficult for users to retrieve data from the NMR-STAR files or the relational database in association with data from other biological databases.
FINDINGS: To enhance the interoperability of the BMRB database, we present a full conversion of BMRB entries to two standard structured data formats, XML and RDF, as common open representations of the NMR-STAR data. Moreover, a SPARQL endpoint has been deployed. The described case study demonstrates that a simple query of the SPARQL endpoints of the BMRB, UniProt, and Online Mendelian Inheritance in Man (OMIM) can be used in NMR and structure-based analysis of proteins combined with information on single nucleotide polymorphisms (SNPs) and their phenotypes.
CONCLUSIONS: We have developed BMRB/XML and BMRB/RDF and demonstrate their use in performing a federated SPARQL query linking the BMRB to other databases through standard semantic web technologies. This will facilitate data exchange across diverse information resources.
PMID: 27927232 [PubMed - in process]
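A federated SPARQL query of the kind mentioned in the BMRB abstract above combines graph patterns evaluated at several endpoints through the SPARQL 1.1 `SERVICE` keyword. The sketch below only assembles such a query as text; it is not the query from the paper. The endpoint URLs, the `ex:` vocabulary, and the variable names are hypothetical placeholders, since the abstract gives none of them.

```python
def build_federated_query(local_pattern, remote_patterns, variables, limit=10):
    """Assemble a SPARQL 1.1 federated query string.

    local_pattern: triple pattern evaluated at the endpoint receiving the query.
    remote_patterns: list of (endpoint_url, triple_pattern) pairs, each wrapped
        in a SERVICE block.
    variables: list of variable names to project, without the leading '?'.
    """
    select_vars = " ".join(f"?{name}" for name in variables)
    service_blocks = "\n".join(
        f"  SERVICE <{url}> {{\n    {pattern}\n  }}"
        for url, pattern in remote_patterns
    )
    return (
        "PREFIX ex: <http://example.org/vocab#>\n"
        f"SELECT {select_vars} WHERE {{\n"
        f"  {local_pattern}\n"
        f"{service_blocks}\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


# Hypothetical endpoints and predicates, for illustration only.
query = build_federated_query(
    local_pattern="?entry ex:describesProtein ?protein .",
    remote_patterns=[
        ("https://example.org/proteins/sparql", "?protein ex:hasVariant ?variant ."),
        ("https://example.org/phenotypes/sparql", "?variant ex:hasPhenotype ?phenotype ."),
    ],
    variables=["entry", "protein", "variant", "phenotype"],
)
```

The shared variables (`?protein`, `?variant`) are what join the results across endpoints; the real vocabularies of each database would replace the `ex:` terms.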
DNA Data Bank of Japan.
Nucleic Acids Res. 2016 Oct 24;:
Authors: Mashima J, Kodama Y, Fujisawa T, Katayama T, Okuda Y, Kaminuma E, Ogasawara O, Okubo K, Nakamura Y, Takagi T
Abstract
The DNA Data Bank of Japan (DDBJ) (http://www.ddbj.nig.ac.jp) has been providing public data services for thirty years (since 1987). We collect nucleotide sequence data from researchers as a member of the International Nucleotide Sequence Database Collaboration (INSDC, http://www.insdc.org), in collaboration with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI). The DDBJ Center also operates the Japanese Genotype-phenotype Archive (JGA) with the National Bioscience Database Center to collect human subject data from Japanese researchers. Here, we report our database activities for INSDC and JGA over the past year, and introduce retrieval and analytical services running on our supercomputer system and their recent modifications. Furthermore, with the Database Center for Life Science, the DDBJ Center is improving semantic web technologies to integrate and share biological data, in order to provide an RDF version of the sequence data.
PMID: 27924010 [PubMed - as supplied by publisher]
The KIT Motion-Language Dataset.
Big Data. 2016 Dec;4(4):236-252
Authors: Plappert M, Mandery C, Asfour T
Abstract
Linking human motion and natural language is of great interest for the generation of semantic representations of human activities as well as for the generation of robot activities based on natural language input. However, although there have been years of research in this area, no standardized and openly available data set exists to support the development and evaluation of such systems. We, therefore, propose the Karlsruhe Institute of Technology (KIT) Motion-Language Dataset, which is large, open, and extensible. We aggregate data from multiple motion capture databases and include them in our data set using a unified representation that is independent of the capture system or marker set, making it easy to work with the data regardless of its origin. To obtain motion annotations in natural language, we apply a crowd-sourcing approach and a web-based tool that was specifically built for this purpose, the Motion Annotation Tool. We thoroughly document the annotation process itself and discuss gamification methods that we used to keep annotators motivated. We further propose a novel method, perplexity-based selection, which systematically selects motions for further annotation that are either under-represented in our data set or that have erroneous annotations. We show that our method mitigates the two aforementioned problems and ensures a systematic annotation process. We provide an in-depth analysis of the structure and contents of our resulting data set, which, as of October 10, 2016, contains 3911 motions with a total duration of 11.23 hours and 6278 annotations in natural language that contain 52,903 words. We believe this makes our data set an excellent choice that enables more transparent and comparable research in this important area.
PMID: 27992262 [PubMed]
(semantic[Title/Abstract] AND web[Title/Abstract]) AND ("2005/01/01"[PDAT] : "3000"[PDAT]); +24 new citations
These pubmed results were generated on 2016/12/15
Isomorphic semantic mapping of variant call format (VCF2RDF).
Bioinformatics. 2016 Oct 25;:
Authors: Penha ED, Iriabho E, Dussaq A, Magalhães de Oliveira D, Almeida JS
Abstract
The move of computational genomics workflows to Cloud Computing platforms is associated with a new level of integration and interoperability that challenges existing data representation formats. The Variant Calling Format (VCF) is in a particularly sensitive position in that regard, with both clinical and consumer-facing analysis tools relying on this self-contained description of genomic variation in Next Generation Sequencing (NGS) results. In this report we identify an isomorphic map between VCF and the reference Resource Description Framework. RDF is advanced by the World Wide Web Consortium (W3C) to enable representations of linked data that are both distributed and discoverable. The resulting ability to decompose VCF reports of genomic variation without loss of context addresses the need to modularize and govern NGS pipelines for Precision Medicine. Specifically, it provides the flexibility (i.e. the indexing) needed to support the wide variety of clinical scenarios and patient-facing governance where only part of the VCF data is fitting.
IMPLEMENTATION: Software libraries with a claim to be both domain-facing and consumer-facing have to pass the test of portability across the variety of devices that those consumers in fact adopt. That is, ideally the implementation should itself take place within the space defined by web technologies. Consequently, the isomorphic mapping function was implemented in JavaScript, and was tested in a variety of environments and devices, client and server side alike. These range from web browsers in mobile phones to the most popular micro service platform, NodeJS.
AVAILABILITY: The code is publicly available at https://github.com/ibl/VCFr, with a live deployment at http://ibl.github.io/VCFr/
CONTACT: Jonas.almeida@stonybrook.edu.
PMID: 27797761 [PubMed - as supplied by publisher]
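The idea of an isomorphic (lossless, reversible) mapping between a VCF record and RDF-style triples, as described in the VCF2RDF abstract above, can be illustrated in a few lines. This sketch is written in Python for consistency with the other examples here, whereas the authors' implementation is in JavaScript, and it is not their mapping: the `vcf:` predicate prefix, the record identifier, and both function names are assumptions made for illustration.

```python
def vcf_record_to_triples(header_line, record_line, record_id):
    """Map one tab-separated VCF data line to (subject, predicate, object) triples.

    Each column name from the '#CHROM ...' header becomes one predicate.
    """
    columns = header_line.lstrip("#").rstrip("\n").split("\t")
    values = record_line.rstrip("\n").split("\t")
    if len(columns) != len(values):
        raise ValueError("header and record have different column counts")
    return [(record_id, f"vcf:{column}", value)
            for column, value in zip(columns, values)]


def triples_to_vcf_record(header_line, triples):
    """Rebuild the original VCF data line from the triples (the inverse map)."""
    columns = header_line.lstrip("#").rstrip("\n").split("\t")
    by_predicate = {predicate: obj for _, predicate, obj in triples}
    return "\t".join(by_predicate[f"vcf:{column}"] for column in columns)


header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"
record = "20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14"
triples = vcf_record_to_triples(header, record, "ex:record1")
restored = triples_to_vcf_record(header, triples)
```

Because each column maps to exactly one triple and back, any subset of the triples can be shared on its own while the full record remains recoverable, which is the property the abstract relies on for decomposing VCF reports without loss of context.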
Lessons learned in the generation of biomedical research datasets using Semantic Open Data technologies.
Stud Health Technol Inform. 2015;210:165-9
Authors: Legaz-García Mdel C, Miñarro-Giménez JA, Menárguez-Tortosa M, Fernández-Breis JT
Abstract
Biomedical research usually requires combining large volumes of data from multiple heterogeneous sources. Such heterogeneity makes not only the generation of research-oriented datasets but also their exploitation difficult. In recent years, the Open Data paradigm has proposed new ways of making data available so that sharing and integration are facilitated. Open Data approaches may pursue the generation of content readable only by humans or by both humans and machines; the latter are the ones of interest in our work. The Semantic Web provides a natural technological space for data integration and exploitation and offers a range of technologies for generating not only Open Datasets but also Linked Datasets, that is, open datasets linked to other open datasets. According to Berners-Lee's classification, each open dataset can be given a rating between one and five stars. In recent years, we have developed and applied our SWIT tool, which automates the generation of semantic datasets from heterogeneous data sources. SWIT produces four-star datasets, given that the fifth star can only be obtained when the dataset is linked from external ones. In this paper, we describe how we have applied the tool in two projects related to health care records and orthology data, as well as the major lessons learned from such efforts.
PMID: 25991123 [PubMed - indexed for MEDLINE]
Developing a modular architecture for creation of rule-based clinical diagnostic criteria.
BioData Min. 2016;9:33
Authors: Hong N, Pathak J, Chute CG, Jiang G
Abstract
BACKGROUND: With recent advances in computerized patient record systems, there is an urgent need for producing computable and standards-based clinical diagnostic criteria. Notably, constructing rule-based clinical diagnostic criteria has become one of the goals in the International Classification of Diseases (ICD)-11 revision. However, few studies have been done on building a unified architecture to support the need for diagnostic criteria computerization. In this study, we present a modular architecture for enabling the creation of rule-based clinical diagnostic criteria leveraging Semantic Web technologies.
METHODS AND RESULTS: The architecture consists of two modules: an authoring module that utilizes a standards-based information model and a translation module that leverages Semantic Web Rule Language (SWRL). In a prototype implementation, we created a diagnostic criteria upper ontology (DCUO) that integrates the ICD-11 content model with the Quality Data Model (QDM). Using the DCUO, we developed a transformation tool that converts QDM-based diagnostic criteria into a Semantic Web Rule Language (SWRL) representation. We evaluated the domain coverage of the upper ontology model using randomly selected diagnostic criteria from broad domains (n = 20). We also tested the transformation algorithms using 6 QDM templates for ontology population and 15 QDM-based criteria data for rule generation. As a result, the first draft of DCUO contains 14 root classes, 21 subclasses, 6 object properties and 1 data property. Investigation Findings, and Signs and Symptoms are the two most commonly used element types. All 6 HQMF templates are successfully parsed and populated into their corresponding domain specific ontologies and 14 rules (93.3 %) passed the rule validation.
CONCLUSION: Our efforts in developing and prototyping a modular architecture provide useful insight into how to build a scalable solution to support diagnostic criteria representation and computerization.
PMID: 27785153 [PubMed - in process]
Using Semantic Web technologies for the generation of domain-specific templates to support clinical study metadata standards.
J Biomed Semantics. 2016;7:10
Authors: Jiang G, Evans J, Endle CM, Solbrig HR, Chute CG
Abstract
BACKGROUND: The Biomedical Research Integrated Domain Group (BRIDG) model is a formal domain analysis model for protocol-driven biomedical research, and serves as a semantic foundation for application and message development in the standards developing organizations (SDOs). The increasing sophistication and complexity of the BRIDG model requires new approaches to the management and utilization of the underlying semantics to harmonize domain-specific standards. The objective of this study is to develop and evaluate a Semantic Web-based approach that integrates the BRIDG model with ISO 21090 data types to generate domain-specific templates to support clinical study metadata standards development.
METHODS: We developed a template generation and visualization system based on an open source Resource Description Framework (RDF) store backend, a SmartGWT-based web user interface, and a "mind map" based tool for the visualization of generated domain-specific templates. We also developed a RESTful Web Service informed by the Clinical Information Modeling Initiative (CIMI) reference model for access to the generated domain-specific templates.
RESULTS: A preliminary usability study was performed and all reviewers (n = 3) gave very positive responses to the evaluation questions in terms of the usability and the capability of meeting the system requirements (with an average score of 4.6).
CONCLUSIONS: Semantic Web technologies provide a scalable infrastructure and have great potential to enable computable semantic interoperability of models in the intersection of health care and clinical research.
PMID: 26949508 [PubMed - indexed for MEDLINE]
Portal of medical data models: information infrastructure for medical research and healthcare.
Database (Oxford). 2016;2016:
Authors: Dugas M, Neuhaus P, Meidt A, Doods J, Storck M, Bruland P, Varghese J
Abstract
INTRODUCTION: Information systems are a key success factor for medical research and healthcare. Currently, most of these systems apply heterogeneous and proprietary data models, which impede data exchange and integrated data analysis for scientific purposes. Due to the complexity of medical terminology, the overall number of medical data models is very high. At present, the vast majority of these models are not available to the scientific community. The objective of the Portal of Medical Data Models (MDM, https://medical-data-models.org) is to foster sharing of medical data models.
METHODS: MDM is a registered European information infrastructure. It provides a multilingual platform for exchange and discussion of data models in medicine, both for medical research and healthcare. The system is developed in collaboration with the University Library of Münster to ensure sustainability. A web front-end enables users to search, view, download and discuss data models. Eleven different export formats are available (ODM, PDF, CDA, CSV, MACRO-XML, REDCap, SQL, SPSS, ADL, R, XLSX). MDM contents were analysed with descriptive statistics.
RESULTS: MDM contains 4387 current versions of data models (in total 10,963 versions). 2475 of these models belong to oncology trials. The most common keyword (n = 3826) is 'Clinical Trial'; most frequent diseases are breast cancer, leukemia, lung and colorectal neoplasms. Most common languages of data elements are English (n = 328,557) and German (n = 68,738). Semantic annotations (UMLS codes) are available for 108,412 data items, 2453 item groups and 35,361 code list items. Overall 335,087 UMLS codes are assigned with 21,847 unique codes. Few UMLS codes are used several thousand times, but there is a long tail of rarely used codes in the frequency distribution.
DISCUSSION: Expected benefits of the MDM portal are improved and accelerated design of medical data models by sharing best practice, more standardised data models with semantic annotation and better information exchange between information systems, in particular Electronic Data Capture (EDC) and Electronic Health Records (EHR) systems. Contents of the MDM portal need to be further expanded to reach broad coverage of all relevant medical domains. Database URL: https://medical-data-models.org.
PMID: 26868052 [PubMed - indexed for MEDLINE]
A new visual navigation system for exploring biomedical Open Educational Resource (OER) videos.
J Am Med Inform Assoc. 2016 Apr;23(e1):e34-41
Authors: Zhao B, Xu S, Lin S, Luo X, Duan L
Abstract
OBJECTIVE: Biomedical videos as open educational resources (OERs) are increasingly proliferating on the Internet. Unfortunately, seeking personally valuable content from among the vast corpus of quality yet diverse OER videos is nontrivial due to limitations of today's keyword- and content-based video retrieval techniques. To address this need, this study introduces a novel visual navigation system that facilitates users' information seeking from biomedical OER videos in mass quantity by interactively offering visual and textual navigational clues that are both semantically revealing and user-friendly.
MATERIALS AND METHODS: The authors collected and processed around 25 000 YouTube videos, which collectively last for a total length of about 4000 h, in the broad field of biomedical sciences for our experiment. For each video, its semantic clues are first extracted automatically through computationally analyzing audio and visual signals, as well as text either accompanying or embedded in the video. These extracted clues are subsequently stored in a metadata database and indexed by a high-performance text search engine. During the online retrieval stage, the system renders video search results as dynamic web pages using a JavaScript library that allows users to interactively and intuitively explore video content both efficiently and effectively.
RESULTS: The authors produced a prototype implementation of the proposed system, which is publicly accessible at https://patentq.njit.edu/oer. To examine the overall advantage of the proposed system for exploring biomedical OER videos, the authors further conducted a user study of a modest scale. The study results encouragingly demonstrate the functional effectiveness and user-friendliness of the new system for facilitating information seeking from and content exploration among massive biomedical OER videos.
CONCLUSION: Using the proposed tool, users can efficiently and effectively find videos of interest, precisely locate video segments delivering personally valuable information, as well as intuitively and conveniently preview essential content of a single or a collection of videos.
PMID: 26335986 [PubMed - indexed for MEDLINE]