Semantic Web
Fish Ontology framework for taxonomy-based fish recognition.
PeerJ. 2017;5:e3811
Authors: Ali NM, Khan HA, Then AY, Ving Ching C, Gaur M, Dhillon SK
Abstract
Life science ontologies play an important role in Semantic Web. Given the diversity in fish species and the associated wealth of information, it is imperative to develop an ontology capable of linking and integrating this information in an automated fashion. As such, we introduce the Fish Ontology (FO), an automated classification architecture of existing fish taxa which provides taxonomic information on unknown fish based on metadata restrictions. It is designed to support knowledge discovery, provide semantic annotation of fish and fisheries resources, data integration, and information retrieval. Automated classification for unknown specimens is a unique feature that currently does not appear to exist in other known ontologies. Examples of automated classification for major groups of fish are demonstrated, showing the inferred information by introducing several restrictions at the species or specimen level. The current version of FO has 1,830 classes, includes widely used fisheries terminology, and models major aspects of fish taxonomy, grouping, and character. With more than 30,000 known fish species globally, the FO will be an indispensable tool for fish scientists and other interested users.
PMID: 28929028 [PubMed]
Learning From Short Text Streams With Topic Drifts.
IEEE Trans Cybern. 2017 Sep 18;:
Authors: Li P, He L, Wang H, Hu X, Zhang Y, Li L, Wu X
Abstract
Short text streams such as search snippets and micro blogs have been popular on the Web with the emergence of social media. Unlike traditional text streams, these data present the characteristics of short length, weak signal, high volume, high velocity, topic drift, etc. Short text stream classification is hence a very challenging and significant task. However, this challenge has received little attention from the research community. Therefore, a new feature extension approach is proposed for short text stream classification with the help of a large-scale semantic network obtained from a Web corpus. It is built on an incremental ensemble classification model for efficiency. First, more semantic contexts based on the senses of terms in short texts are introduced to make up for the data sparsity using the open semantic network, in which all terms are disambiguated by their semantics to reduce the noise impact. Second, a concept cluster-based topic drifting detection method is proposed to effectively track hidden topic drifts. Finally, extensive studies demonstrate that, compared to several well-known concept drifting detection methods in data streams, our approach can detect topic drifts effectively, and it enables handling short text streams effectively while maintaining efficiency compared to several state-of-the-art short text classification approaches.
PMID: 28922135 [PubMed - as supplied by publisher]
Extending XNAT Platform with an Incremental Semantic Framework.
Front Neuroinform. 2017;11:57
Authors: Timón S, Rincón M, Martínez-Tomás R
Abstract
Informatics increases the yield from neuroscience due to improved data. Data sharing and accessibility enable joint efforts between different research groups, as well as replication studies, pivotal for progress in the field. Research data archiving solutions are evolving rapidly to address these necessities; however, distributed data integration is still difficult because of the need for explicit agreements for disparate data models. To address these problems, ontologies are widely used in biomedical research to obtain common vocabularies and logical descriptions, but their application may suffer from scalability issues, domain bias, and loss of low-level data access. With the aim of improving the application of semantic models in biobanking systems, an incremental semantic framework that takes advantage of the latest advances in biomedical ontologies and the XNAT platform is designed and implemented. We follow a layered architecture that allows the alignment of multi-domain biomedical ontologies to manage data at different levels of abstraction. To illustrate this approach, the development is integrated in the JPND (EU Joint Program for Neurodegenerative Disease) APGeM project, focused on finding early biomarkers for Alzheimer's and other dementia related diseases.
PMID: 28912709 [PubMed]
Optimizing a Query by Transformation and Expansion.
Stud Health Technol Inform. 2017;243:197-201
Authors: Glocker K, Knurr A, Dieter J, Dominick F, Forche M, Koch C, Pascoe Pérez A, Roth B, Ückert F
Abstract
In the biomedical sector not only the amount of information produced and uploaded into the web is enormous, but also the number of sources where these data can be found. Clinicians and researchers spend huge amounts of time on trying to access this information and to filter the most important answers to a given question. As the formulation of these queries is crucial, automated query expansion is an effective tool to optimize a query and receive the best possible results. In this paper we introduce the concept of a workflow for an optimization of queries in the medical and biological sector by using a series of tools for expansion and transformation of the query. After the definition of attributes by the user, the query string is compared to previous queries in order to add semantic co-occurring terms to the query. Additionally, the query is enlarged by an inclusion of synonyms. The translation into database specific ontologies ensures the optimal query formulation for the chosen database(s). As this process can be performed in various databases at once, the results are ranked and normalized in order to achieve a comparable list of answers for a question.
PMID: 28883200 [PubMed - in process]
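The expansion step described in the abstract above (adding co-occurring terms from previous queries, then synonyms) can be illustrated with a small Python sketch. This is not the authors' workflow: the synonym table, the query log, and the function names `cooccurring_terms` and `expand_query` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical synonym table; a real system would draw on a thesaurus or ontology.
SYNONYMS = {
    "tumor": ["neoplasm"],
    "heart": ["cardiac"],
}

# Hypothetical log of previous queries, used to find co-occurring terms.
PREVIOUS_QUERIES = [
    "lung tumor therapy",
    "lung tumor screening",
    "heart failure therapy",
]


def cooccurring_terms(term, queries, top_n=1):
    """Return the terms that most often appear alongside `term` in past queries."""
    counts = Counter()
    for query in queries:
        words = query.split()
        if term in words:
            counts.update(w for w in words if w != term)
    return [word for word, _ in counts.most_common(top_n)]


def expand_query(query, synonyms=SYNONYMS, queries=PREVIOUS_QUERIES):
    """Expand a query with synonyms and co-occurring terms, keeping first-seen order."""
    expanded = []
    for term in query.lower().split():
        candidates = [term] + synonyms.get(term, []) + cooccurring_terms(term, queries)
        for candidate in candidates:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


if __name__ == "__main__":
    print(expand_query("tumor"))
```

The sketch stops at expansion; the translation into database-specific ontologies and the ranking and normalization of results mentioned in the abstract are not modeled here.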
Expert2OWL: A Methodology for Pattern-Based Ontology Development.
Stud Health Technol Inform. 2017;243:165-169
Authors: Tahar K, Xu J, Herre H
Abstract
The formalization of expert knowledge enables a broad spectrum of applications employing ontologies as underlying technology. These include eLearning, Semantic Web and expert systems. However, the manual construction of such ontologies is time-consuming and thus expensive. Moreover, experts are often unfamiliar with the syntax and semantics of formal ontology languages such as OWL and usually have no experience in developing formal ontologies. To overcome these barriers, we developed a new method and tool, called Expert2OWL that provides efficient features to support the construction of OWL ontologies using GFO (General Formal Ontology) as a top-level ontology. This method allows a close and effective collaboration between ontologists and domain experts. Essentially, this tool integrates Excel spreadsheets as part of a pattern-based ontology development and refinement process. Expert2OWL enables us to expedite the development process and modularize the resulting ontologies. We applied this method in the field of Chinese Herbal Medicine (CHM) and used Expert2OWL to automatically generate an accurate Chinese Herbology ontology (CHO). The expressivity of CHO was tested and evaluated using ontology query languages SPARQL and DL. CHO shows promising results and can generate answers to important scientific questions such as which Chinese herbal formulas contain which substances, which substances treat which diseases, and which ones are the most frequently used in CHM.
PMID: 28883193 [PubMed - in process]
BIOSSES: a semantic sentence similarity estimation system for the biomedical domain.
Bioinformatics. 2017 Jul 15;33(14):i49-i58
Authors: Sogancioglu G, Öztürk H, Özgür A
Abstract
Motivation: The amount of information available in textual format is rapidly increasing in the biomedical domain. Therefore, natural language processing (NLP) applications are becoming increasingly important to facilitate the retrieval and analysis of these data. Computing the semantic similarity between sentences is an important component in many NLP tasks including text retrieval and summarization. A number of approaches have been proposed for semantic sentence similarity estimation for generic English. However, our experiments showed that such approaches do not effectively cover biomedical knowledge and produce poor results for biomedical text.
Methods: We propose several approaches for sentence-level semantic similarity computation in the biomedical domain, including string similarity measures and measures based on the distributed vector representations of sentences learned in an unsupervised manner from a large biomedical corpus. In addition, ontology-based approaches are presented that utilize general and domain-specific ontologies. Finally, a supervised regression based model is developed that effectively combines the different similarity computation metrics. A benchmark data set consisting of 100 sentence pairs from the biomedical literature is manually annotated by five human experts and used for evaluating the proposed methods.
Results: The experiments showed that the supervised semantic sentence similarity computation approach obtained the best performance (0.836 correlation with gold standard human annotations) and improved over the state-of-the-art domain-independent systems up to 42.6% in terms of the Pearson correlation metric.
Availability and implementation: A web-based system for biomedical semantic sentence similarity computation, the source code, and the annotated benchmark data set are available at: http://tabilab.cmpe.boun.edu.tr/BIOSSES/ .
Contact: gizemsogancioglu@gmail.com or arzucan.ozgur@boun.edu.tr.
PMID: 28881973 [PubMed - in process]
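The BIOSSES results above are reported as a Pearson correlation between system scores and human annotations. As a reminder of what that metric computes, here is a minimal Python sketch. The function name `pearson` and the example scores are made up for illustration and are not taken from the BIOSSES benchmark or source code.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centered products and squares; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical similarity scores for five sentence pairs (not real data).
    system_scores = [0.9, 0.2, 0.6, 0.4, 0.8]
    human_scores = [4.0, 1.0, 3.0, 1.5, 3.5]
    print(round(pearson(system_scores, human_scores), 3))
```

A value near 1 means the system ranks and spaces sentence pairs much as the annotators did; the metric is insensitive to the scale of either score list.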
Accent modulates access to word meaning: Evidence for a speaker-model account of spoken word recognition.
Cogn Psychol. 2017 Sep 04;98:73-101
Authors: Cai ZG, Gilbert RA, Davis MH, Gaskell MG, Farrar L, Adler S, Rodd JM
Abstract
Speech carries accent information relevant to determining the speaker's linguistic and social background. A series of web-based experiments demonstrate that accent cues can modulate access to word meaning. In Experiments 1-3, British participants were more likely to retrieve the American dominant meaning (e.g., hat meaning of "bonnet") in a word association task if they heard the words in an American than a British accent. In addition, results from a speeded semantic decision task (Experiment 4) and sentence comprehension task (Experiment 5) confirm that accent modulates on-line meaning retrieval such that comprehension of ambiguous words is easier when the relevant word meaning is dominant in the speaker's dialect. Critically, neutral-accent speech items, created by morphing British- and American-accented recordings, were interpreted in a similar way to accented words when embedded in a context of accented words (Experiment 2). This finding indicates that listeners do not use accent to guide meaning retrieval on a word-by-word basis; instead they use accent information to determine the dialectic identity of a speaker and then use their experience of that dialect to guide meaning access for all words spoken by that person. These results motivate a speaker-model account of spoken word recognition in which comprehenders determine key characteristics of their interlocutor and use this knowledge to guide word meaning access.
PMID: 28881224 [PubMed - as supplied by publisher]
RDFIO: extending Semantic MediaWiki for interoperable biomedical data management.
J Biomed Semantics. 2017 Sep 04;8(1):35
Authors: Lampa S, Willighagen E, Kohonen P, King A, Vrandečić D, Grafström R, Spjuth O
Abstract
BACKGROUND: Biological sciences are characterised not only by an increasing amount but also by the extreme complexity of their data. This stresses the need for efficient ways of integrating these data in a coherent description of biological systems. In many cases, biological data need organization before integration. This is often a collaborative effort, and it is thus important that tools for data integration support a collaborative way of working. Wiki systems with support for structured semantic data authoring, such as Semantic MediaWiki, provide a powerful solution for collaborative editing of data combined with machine-readability, so that data can be handled in an automated fashion in any downstream analyses. However, Semantic MediaWiki lacks a built-in data import function, which hinders efficient round-tripping of data between interoperable Semantic Web formats such as RDF and the internal wiki format.
RESULTS: To solve this deficiency, the RDFIO suite of tools is presented, which supports importing of RDF data into Semantic MediaWiki, with metadata needed to export it again in the same RDF format, or ontology. Additionally, the new functionality enables mash-ups of automated data imports combined with manually created data presentations. The application of the suite of tools is demonstrated by importing drug discovery related data about rare diseases from Orphanet and acid dissociation constants from Wikidata. The RDFIO suite of tools is freely available for download via pharmb.io/project/rdfio .
CONCLUSIONS: Through a set of biomedical demonstrators, it is demonstrated how the new functionality enables a number of usage scenarios where the interoperability of SMW and the wider Semantic Web is leveraged for biomedical data sets, to create an easy to use and flexible platform for exploring and working with biomedical data.
PMID: 28870259 [PubMed - in process]
Mouse Genome Informatics (MGI) Resource: Genetic, Genomic, and Biological Knowledgebase for the Laboratory Mouse.
ILAR J. 2017 Jul 01;58(1):17-41
Authors: Eppig JT
Abstract
The Mouse Genome Informatics (MGI) Resource supports basic, translational, and computational research by providing high-quality, integrated data on the genetics, genomics, and biology of the laboratory mouse. MGI serves a strategic role for the scientific community in facilitating biomedical, experimental, and computational studies investigating the genetics and processes of diseases and enabling the development and testing of new disease models and therapeutic interventions. This review describes the nexus of the body of growing genetic and biological data and the advances in computer technology in the late 1980s, including the World Wide Web, that together launched the beginnings of MGI. MGI develops and maintains a gold-standard resource that reflects the current state of knowledge, provides semantic and contextual data integration that fosters hypothesis testing, continually develops new and improved tools for searching and analysis, and partners with the scientific community to assure research data needs are met. Here we describe one slice of MGI relating to the development of community-wide large-scale mutagenesis and phenotyping projects and introduce ways to access and use these MGI data. References and links to additional MGI aspects are provided.
PMID: 28838066 [PubMed - in process]
PathEdEx - Uncovering High-explanatory Visual Diagnostics Heuristics Using Digital Pathology and Multiscale Gaze Data.
J Pathol Inform. 2017;8:29
Authors: Shin D, Kovalenko M, Ersoy I, Li Y, Doll D, Shyu CR, Hammer R
Abstract
BACKGROUND: Visual heuristics of pathology diagnosis is a largely unexplored area where reported studies only provided a qualitative insight into the subject. Uncovering and quantifying pathology visual and nonvisual diagnostic patterns have great potential to improve clinical outcomes and avoid diagnostic pitfalls.
METHODS: Here, we present PathEdEx, an informatics computational framework that incorporates whole-slide digital pathology imaging with multiscale gaze-tracking technology to create web-based interactive pathology educational atlases and to datamine visual and nonvisual diagnostic heuristics.
RESULTS: We demonstrate the capabilities of PathEdEx for mining visual and nonvisual diagnostic heuristics using the first PathEdEx volume of a hematopathology atlas. We conducted a quantitative study on the time dynamics of zooming and panning operations utilized by experts and novices to come to the correct diagnosis. We then performed association rule mining to determine sets of diagnostic factors that consistently result in a correct diagnosis, and studied differences in diagnostic strategies across different levels of pathology expertise using Markov chain (MC) modeling and MC Monte Carlo simulations. To perform these studies, we translated raw gaze points to high-explanatory semantic labels that represent pathology diagnostic clues. Therefore, the outcome of these studies is readily transformed into narrative descriptors for direct use in pathology education and practice.
CONCLUSION: PathEdEx framework can be used to capture best practices of pathology visual and nonvisual diagnostic heuristics that can be passed over to the next generation of pathologists and have potential to streamline implementation of precision diagnostics in precision medicine settings.
PMID: 28828200 [PubMed]
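The PathEdEx abstract above mentions Markov chain modeling of diagnostic strategies over semantic labels derived from gaze data. The abstract does not say how the chains were estimated, so the following Python sketch only shows one common approach, a first-order maximum-likelihood estimate of transition probabilities. The label names and the function `transition_probabilities` are hypothetical.

```python
from collections import defaultdict


def transition_probabilities(sequences):
    """Estimate first-order Markov transition probabilities from label sequences.

    Each sequence is a list of labels in viewing order. The estimate for
    P(next | current) is the observed count of that transition divided by
    the number of transitions leaving `current`.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1

    probs = {}
    for state, next_counts in counts.items():
        total = sum(next_counts.values())
        probs[state] = {nxt: c / total for nxt, c in next_counts.items()}
    return probs


if __name__ == "__main__":
    # Hypothetical sessions; the labels are placeholders, not PathEdEx labels.
    sessions = [
        ["overview", "zoom", "diagnose"],
        ["overview", "zoom", "zoom", "diagnose"],
    ]
    for state, row in transition_probabilities(sessions).items():
        print(state, row)
```

Comparing such transition tables between expert and novice sessions is one way differences in strategy could be made visible; the published study may use a different formulation.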
Owlready: Ontology-oriented programming in Python with automatic classification and high level constructs for biomedical ontologies.
Artif Intell Med. 2017 Aug 14;:
Authors: Lamy JB
Abstract
OBJECTIVE: Ontologies are widely used in the biomedical domain. While many tools exist for the edition, alignment or evaluation of ontologies, few solutions have been proposed for ontology programming interface, i.e. for accessing and modifying an ontology within a programming language. Existing query languages (such as SPARQL) and APIs (such as OWLAPI) are not as easy-to-use as object programming languages are. Moreover, they provide few solutions to difficulties encountered with biomedical ontologies. Our objective was to design a tool for accessing easily the entities of an OWL ontology, with high-level constructs helping with biomedical ontologies.
METHODS: From our experience on medical ontologies, we identified two difficulties: (1) many entities are represented by classes (rather than individuals), but the existing tools do not permit manipulating classes as easily as individuals, (2) ontologies rely on the open-world assumption, whereas the medical reasoning must consider only evidence-based medical knowledge as true. We designed a Python module for ontology-oriented programming. It allows access to the entities of an OWL ontology as if they were objects in the programming language. We propose a simple high-level syntax for managing classes and the associated "role-filler" constraints. We also propose an algorithm for performing local closed world reasoning in simple situations.
RESULTS: We developed Owlready, a Python module for a high-level access to OWL ontologies. The paper describes the architecture and the syntax of the module version 2. It details how we integrated the OWL ontology model with the Python object model. The paper provides examples based on Gene Ontology (GO). We also demonstrate the interest of Owlready in a use case focused on the automatic comparison of the contraindications of several drugs. This use case illustrates the use of the specific syntax proposed for manipulating classes and for performing local closed world reasoning.
CONCLUSION: Owlready has been successfully used in a medical research project. It has been published as Open-Source software and then used by many other researchers. Future developments will focus on the support of vagueness and additional non-monotonic reasoning feature, and automatic dialog box generation.
PMID: 28818520 [PubMed - as supplied by publisher]
New tools and functions in Data-out activities at Protein Data Bank Japan (PDBj).
Protein Sci. 2017 Aug 17;:
Authors: Kinjo AR, Bekker GJ, Wako H, Endo S, Tsuchiya Y, Sato H, Nishi H, Kinoshita K, Suzuki H, Kawabata T, Yokochi M, Iwata T, Kobayashi N, Fujiwara T, Kurisu G, Nakamura H
Abstract
The Protein Data Bank Japan (PDBj), a member of the worldwide Protein Data Bank (wwPDB), accepts and processes the deposited data of experimentally determined biological macromolecular structures. In addition to archiving the PDB data in collaboration with the other wwPDB partners, PDBj also provides a wide range of original and unique services and tools, which are continuously improved and updated. Here, we report the new RDB PDBj Mine 2, the WebGL molecular viewer Molmil, the ProMode-Elastic server for normal mode analysis, a virtual reality system for the eF-site protein electrostatic molecular surfaces, the extensions of the Omokage search for molecular shape similarity, and the integration of PDBj and BMRB searches.
PMID: 28815765 [PubMed - as supplied by publisher]
Retracted: Generating Personalized Web Search Using Semantic Context.
ScientificWorldJournal. 2017;2017:1295378
Authors: The Scientific World Journal
Abstract
[This retracts the article DOI: 10.1155/2015/462782.].
PMID: 28791315 [PubMed - in process]
Measuring Global Disease with Wikipedia: Success, Failure, and a Research Agenda.
Comput Support Coop Work. 2017 Feb-Mar;2017:1812-1834
Authors: Priedhorsky R, Osthus D, Daughton AR, Moran KR, Generous N, Fairchild G, Deshpande A, Del Valle SY
Abstract
Effective disease monitoring provides a foundation for effective public health systems. This has historically been accomplished with patient contact and bureaucratic aggregation, which tends to be slow and expensive. Recent internet-based approaches promise to be real-time and cheap, with few parameters. However, the question of when and how these approaches work remains open. We addressed this question using Wikipedia access logs and category links. Our experiments, replicable and extensible using our open source code and data, test the effect of semantic article filtering, amount of training data, forecast horizon, and model staleness by comparing across 6 diseases and 4 countries using thousands of individual models. We found that our minimal-configuration, language-agnostic article selection process based on semantic relatedness is effective for improving predictions, and that our approach is relatively insensitive to the amount and age of training data. We also found, in contrast to prior work, very little forecasting value, and we argue that this is consistent with theoretical considerations about the nature of forecasting. These mixed results lead us to propose that the currently observational field of internet-based disease surveillance must pivot to include theoretical models of information flow as well as controlled experiments based on simulations of disease.
PMID: 28782059 [PubMed - in process]
Protocol-Driven Decision Support within e-Referral Systems to Streamline Patient Consultation, Triaging and Referrals from Primary Care to Specialist Clinics.
J Med Syst. 2017 Sep;41(9):139
Authors: Maghsoud-Lou E, Christie S, Abidi SR, Abidi SSR
Abstract
Patient referral is a protocol where the referring primary care physician refers the patient to a specialist for further treatment. The current paper-based referral process at times leads to communication and operational issues, resulting in either an unfulfilled referral request or an unnecessary referral request. Despite the availability of standardized referral protocols, they are not readily applied because they are tedious and time-consuming, thus resulting in suboptimal referral requests. We present a semantic-web based Referral Knowledge Modeling and Execution Framework to computerize referral protocols, clinical guidelines and assessment tools in order to develop a computerized e-Referral system that offers protocol-based decision support to streamline and standardize the referral process. We have developed a Spinal Problem E-Referral (SPER) system that computerizes the Spinal Condition Consultation Protocol (SCCP) mandated by the Halifax Infirmary Division of Neurosurgery (Halifax, Canada) for referrals for spine related conditions (such as back pain). The SPER system executes the ontologically modeled SCCP to determine (i) the patient's triaging option as per severity assessments stipulated by SCCP; and (ii) clinical recommendations as per the clinical guidelines incorporated within SCCP. In operation, the SPER system identifies the critical cases and triages them for specialist referral, whereas for non-critical cases the SPER system provides clinical guideline based recommendations to help the primary care physician effectively manage the patient. The SPER system has undergone a pilot usability study and was deemed to be easy to use by physicians, with potential to improve the referral process within the Division of Neurosurgery at QEII Health Science Center, Halifax, Canada.
PMID: 28766103 [PubMed - in process]
Minimally inconsistent reasoning in Semantic Web.
PLoS One. 2017;12(7):e0181056
Authors: Zhang X
Abstract
Reasoning with inconsistencies is an important issue for Semantic Web as imperfect information is unavoidable in real applications. For this, different paraconsistent approaches, due to their capacity to draw nontrivial conclusions by tolerating inconsistencies, have been proposed to reason with inconsistent description logic knowledge bases. However, existing paraconsistent approaches are often criticized for being too skeptical. To this end, this paper presents a non-monotonic paraconsistent version of description logic reasoning, called minimally inconsistent reasoning, where inconsistencies tolerated in the reasoning are minimized so that more reasonable conclusions can be inferred. Some desirable properties are studied, showing that the new semantics inherits advantages of both non-monotonic reasoning and paraconsistent reasoning. A complete and sound tableau-based algorithm, called multi-valued tableaux, is developed to capture the minimally inconsistent reasoning. In fact, the tableaux algorithm is designed, as a framework for multi-valued DL, to allow for different underlying paraconsistent semantics, with the mere difference in the clash conditions. Finally, the complexity of minimally inconsistent description logic reasoning is shown to be on the same level as that of (classical) description logic reasoning.
PMID: 28750030 [PubMed - in process]
Semantic Web, Reusable Learning Objects, Personal Learning Networks in Health: Key Pieces for Digital Health Literacy.
Stud Health Technol Inform. 2017;238:219-222
Authors: Konstantinidis ST, Wharrad H, Windle R, Bamidis PD
Abstract
The knowledge existing in the World Wide Web is exponentially expanding, while continuous advancements in health sciences contribute to the creation of new knowledge. There are a lot of efforts trying to identify how social connectivity can endorse patients' empowerment, while other studies look at the identification and the quality of online materials. However, emphasis has not been put on the big picture of connecting the existing resources with the patients' "new habits" of learning through their own Personal Learning Networks. In this paper we propose a framework for empowering patients' digital health literacy adjusted to patients' current needs by utilizing the contemporary way of learning through Personal Learning Networks, existing high quality learning resources and semantic technologies for interconnecting knowledge pieces. The framework is based on the concept of knowledge maps for health as defined in this paper. Health Digital Literacy definitely needs further enhancement, and the use of the proposed concept might lead to useful tools which enable use of understandable, trusted health resources tailored to each person's needs.
PMID: 28679928 [PubMed - in process]
Issues Associated With the Use of Semantic Web Technology in Knowledge Acquisition for Clinical Decision Support Systems: Systematic Review of the Literature.
JMIR Med Inform. 2017 Jul 05;5(3):e18
Authors: Zolhavarieh S, Parry D, Bai Q
Abstract
BACKGROUND: Knowledge-based clinical decision support system (KB-CDSS) can be used to help practitioners make diagnostic decisions. KB-CDSS may use clinical knowledge obtained from a wide variety of sources to make decisions. However, knowledge acquisition is one of the well-known bottlenecks in KB-CDSSs, partly because of the enormous growth in health-related knowledge available and the difficulty in assessing the quality of this knowledge as well as identifying the "best" knowledge to use. This bottleneck not only means that lower-quality knowledge is being used, but also that KB-CDSSs are difficult to develop for areas where expert knowledge may be limited or unavailable. Recent methods have been developed by utilizing Semantic Web (SW) technologies in order to automatically discover relevant knowledge from knowledge sources.
OBJECTIVE: The two main objectives of this study were to (1) identify and categorize knowledge acquisition issues that have been addressed through using SW technologies and (2) highlight the role of SW for acquiring knowledge used in the KB-CDSS.
METHODS: We conducted a systematic review of the recent work related to knowledge acquisition for clinical decision support systems published in scientific journals. In this regard, we used the keyword search technique to extract relevant papers.
RESULTS: The retrieved papers were categorized based on two main issues: (1) format and data heterogeneity and (2) lack of semantic analysis. Most existing approaches will be discussed under these categories. A total of 27 papers were reviewed in this study.
CONCLUSIONS: The potential for using SW technology in KB-CDSS has only been considered to a minor extent so far despite its promise. This review identifies some questions and issues regarding use of SW technology for extracting relevant knowledge for a KB-CDSS.
PMID: 28679487 [PubMed - in process]
CodeMapper: semiautomatic coding of case definitions. A contribution from the ADVANCE project.
Pharmacoepidemiol Drug Saf. 2017 Jun 28;:
Authors: Becker BFH, Avillach P, Romio S, van Mulligen EM, Weibel D, Sturkenboom MCJM, Kors JA, ADVANCE consortium
Abstract
BACKGROUND: Assessment of drug and vaccine effects by combining information from different healthcare databases in the European Union requires extensive efforts in the harmonization of codes as different vocabularies are being used across countries. In this paper, we present a web application called CodeMapper, which assists in the mapping of case definitions to codes from different vocabularies, while keeping a transparent record of the complete mapping process.
METHODS: CodeMapper builds upon coding vocabularies contained in the Metathesaurus of the Unified Medical Language System. The mapping approach consists of three phases. First, medical concepts are automatically identified in a free-text case definition. Second, the user revises the set of medical concepts by adding or removing concepts, or expanding them to related concepts that are more general or more specific. Finally, the selected concepts are projected to codes from the targeted coding vocabularies. We evaluated the application by comparing codes that were automatically generated from case definitions by applying CodeMapper's concept identification and successive concept expansion, with reference codes that were manually created in a previous epidemiological study.
RESULTS: Automated concept identification alone had a sensitivity of 0.246 and positive predictive value (PPV) of 0.420 for reproducing the reference codes. Three successive steps of concept expansion increased sensitivity to 0.953 and PPV to 0.616.
CONCLUSIONS: Automatic concept identification in the case definition alone was insufficient to reproduce the reference codes, but CodeMapper's operations for concept expansion provide an effective, efficient, and transparent way for reproducing the reference codes.
PMID: 28657162 [PubMed - as supplied by publisher]
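The CodeMapper evaluation above compares automatically generated codes with manually created reference codes using sensitivity and positive predictive value (PPV). A minimal Python sketch of those two measures over code sets follows. The function name `sensitivity_and_ppv` and the placeholder codes are hypothetical and are not taken from CodeMapper or from the study data.

```python
def sensitivity_and_ppv(generated, reference):
    """Compare a generated code set against a reference code set.

    sensitivity = shared codes / reference codes (how much of the reference is found)
    PPV         = shared codes / generated codes (how much of the output is correct)
    """
    generated, reference = set(generated), set(reference)
    true_positives = len(generated & reference)
    sensitivity = true_positives / len(reference) if reference else 0.0
    ppv = true_positives / len(generated) if generated else 0.0
    return sensitivity, ppv


if __name__ == "__main__":
    # Placeholder codes for illustration only.
    generated_codes = {"C1", "C2", "C3", "C4"}
    reference_codes = {"C1", "C2", "C5"}
    sens, ppv = sensitivity_and_ppv(generated_codes, reference_codes)
    print(f"sensitivity={sens:.3f} PPV={ppv:.3f}")
```

Expanding concepts to related ones enlarges the generated set, which is consistent with the reported jump in sensitivity after expansion; how PPV moves depends on how many of the added codes are in the reference set.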
Knowledge Based Topic Model for Unsupervised Object Discovery and Localization.
IEEE Trans Image Process. 2017 Jun 22;:
Authors: Niu Z, Hua G, Wang L, Gao X
Abstract
Unsupervised object discovery and localization aims to discover some dominant object classes and localize all object instances in a given image collection without any supervision. Previous work has attempted to tackle this problem with vanilla topic models such as Latent Dirichlet Allocation (LDA). However, in those methods no prior knowledge for the given image collection is exploited to facilitate object discovery. In addition, the topic models used in those methods suffer from the topic coherence issue: some inferred topics do not have a clear meaning, which limits the final performance of object discovery. In this paper, prior knowledge in terms of the so-called Must-Links is exploited from Web images on the Internet. Furthermore, a novel knowledge-based topic model, called Latent Dirichlet Allocation with Mixture of Dirichlet Trees (LDA-MDT), is proposed to incorporate the Must-Links into topic modeling for object discovery. In particular, to better deal with the polysemy phenomenon of visual words, the Must-Link is re-defined so that one Must-Link only constrains one or some topic(s) instead of all topics, which leads to significantly improved topic coherence. Moreover, the Must-Links are built and grouped with respect to specific object classes; thus the Must-Links in our approach are semantic-specific, which allows discriminative prior knowledge from Web images to be exploited more efficiently. Extensive experiments validated the efficiency of our proposed approach on several datasets. It is shown that our method significantly improves topic coherence and outperforms the unsupervised methods for object discovery and localization. In addition, compared to discriminative methods, the naturally existing object classes in the given image collection can be subtly discovered, which makes our approach well suited for realistic applications of unsupervised object discovery.
PMID: 28650813 [PubMed - as supplied by publisher]