Semantic Web
A novel framework for biomedical entity sense induction.
J Biomed Inform. 2018 Jun 20;:
Authors: Lossio-Ventura JA, Bian J, Jonquet C, Roche M, Teisseire M
Abstract
BACKGROUND: Rapid advancements in biomedical research have accelerated the growth in the number of relevant electronic documents published online, ranging from scholarly articles to news, blogs, and user-generated social media content. Nevertheless, the vast amount of this information is poorly organized, making it difficult to navigate. Emerging technologies such as ontologies and knowledge bases (KBs) could help organize and track the information associated with biomedical research developments. A major challenge in the automatic construction of ontologies and KBs is the identification of words with their respective sense(s) from a free-text corpus. Word-sense induction (WSI) is the task of automatically inducing the different senses of a target word in different contexts. In the last two decades, there have been several efforts on WSI. However, few methods are effective in biomedicine and life sciences.
METHODS: We developed a framework for biomedical entity sense induction using a mixture of natural language processing, supervised, and unsupervised learning methods with promising results. It is composed of three main steps: 1) a polysemy detection method to determine if a biomedical entity has many possible meanings; 2) a clustering quality index-based approach to predict the number of senses for the biomedical entity; and 3) a method to induce the concept(s) (i.e., senses) of the biomedical entity in a given context.
RESULTS: To evaluate our framework, we used the well-known MSH WSD polysemic dataset that contains 203 annotated ambiguous biomedical entities, where each entity is linked to 2 to 5 concepts. First, our polysemy detection method obtained an F-measure of 98%. Second, our approach for predicting the number of senses achieved an F-measure of 93%. Finally, we induced the concepts of the biomedical entities based on a clustering algorithm and then extracted the keywords of each cluster to represent the concept.
CONCLUSIONS: We have developed a framework for biomedical entity sense induction with promising results. Our study results can benefit a number of downstream applications, for example, help to resolve concept ambiguities when building Semantic Web KBs from biomedical text.
PMID: 29935347 [PubMed - as supplied by publisher]
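Step 2 of the framework above (predicting the number of senses from a clustering quality index) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the index or the clustering algorithm, so the silhouette score and average-linkage agglomerative clustering are swapped in here, and the 2-D "context vectors" and the function names agglomerate, silhouette and predict_num_senses are hypothetical.

```python
from itertools import combinations
from math import dist


def agglomerate(points, k):
    """Average-linkage agglomerative clustering down to k clusters (lists of point indices)."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        def linkage(pair):
            a, b = pair
            total = sum(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
            return total / (len(clusters[a]) * len(clusters[b]))
        # Merge the two closest clusters; a < b, so deleting b keeps a valid.
        a, b = min(combinations(range(len(clusters)), 2), key=linkage)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def silhouette(points, clusters):
    """Mean silhouette score of a partition with at least two clusters."""
    scores = []
    for c_idx, cluster in enumerate(clusters):
        for i in cluster:
            if len(cluster) == 1:
                scores.append(0.0)  # usual convention for singleton clusters
                continue
            a = sum(dist(points[i], points[j]) for j in cluster if j != i) / (len(cluster) - 1)
            b = min(sum(dist(points[i], points[j]) for j in other) / len(other)
                    for o_idx, other in enumerate(clusters) if o_idx != c_idx)
            scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def predict_num_senses(points, k_range=range(2, 6)):
    """Return the candidate k whose clustering has the highest silhouette score."""
    candidates = [k for k in k_range if k < len(points)]
    return max(candidates, key=lambda k: silhouette(points, agglomerate(points, k)))


# Hypothetical context vectors for one ambiguous entity: three well-separated groups.
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
num_senses = predict_num_senses(points)
```

The candidate range 2 to 5 mirrors the MSH WSD dataset described in the abstract, where each entity is linked to 2 to 5 concepts.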
Predicting drug-disease associations by using similarity constrained matrix factorization.
BMC Bioinformatics. 2018 Jun 19;19(1):233
Authors: Zhang W, Yue X, Lin W, Wu W, Liu R, Huang F, Liu F
Abstract
BACKGROUND: Drug-disease associations provide important information for drug discovery. Wet experiments that identify drug-disease associations are time-consuming and expensive, and many drug-disease associations are still unobserved or unknown. The development of computational methods for predicting unobserved drug-disease associations is therefore an important and urgent task.
RESULTS: In this paper, we proposed a similarity constrained matrix factorization method for the drug-disease association prediction (SCMFDD), which makes use of known drug-disease associations, drug features and disease semantic information. SCMFDD projects the drug-disease association relationship into two low-rank spaces, which uncover latent features for drugs and diseases, and then introduces drug feature-based similarities and disease semantic similarity as constraints for drugs and diseases in low-rank spaces. Different from the classic matrix factorization technique, SCMFDD takes the biological context of the problem into account. In computational experiments, the proposed method can produce high-accuracy performances on benchmark datasets, and outperform existing state-of-the-art prediction methods when evaluated by five-fold cross validation and independent testing.
CONCLUSION: We developed a user-friendly web server by using known associations collected from the CTD database, available at http://www.bioinfotech.cn/SCMFDD/. The case studies show that the server can find novel associations that are not included in the CTD database.
PMID: 29914348 [PubMed - in process]
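The idea behind SCMFDD described above can be written as a generic similarity-regularized factorization objective. The form below is a sketch, not the authors' exact formulation: the abstract gives no equations, so the symbols $A$ (known drug-disease association matrix), $X$ and $Y$ (low-rank drug and disease factors with rows $x_i$ and $y_p$), $S^{d}$ and $S^{s}$ (drug feature-based and disease semantic similarities), and the weights $\mu$ and $\lambda$ are all assumptions.

```latex
\min_{X,\,Y}\;
\frac{1}{2}\,\lVert A - X Y^{\top} \rVert_F^{2}
+ \frac{\mu}{2}\left( \lVert X \rVert_F^{2} + \lVert Y \rVert_F^{2} \right)
+ \frac{\lambda}{2} \sum_{i,j} S^{d}_{ij}\, \lVert x_i - x_j \rVert^{2}
+ \frac{\lambda}{2} \sum_{p,q} S^{s}_{pq}\, \lVert y_p - y_q \rVert^{2}
```

The first term reconstructs the known associations, the second is ordinary regularization, and the last two pull the latent vectors of similar drugs (and of similar diseases) together, which is how a similarity constraint brings biological context into an otherwise classic matrix factorization.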
A Surveillance Infrastructure for Malaria Analytics: Provisioning Data Access and Preservation of Interoperability.
JMIR Public Health Surveill. 2018 Jun 15;4(2):e10218
Authors: Al Manir MS, Brenas JH, Baker CJ, Shaban-Nejad A
Abstract
BACKGROUND: According to the World Health Organization, malaria surveillance is weakest in countries and regions with the highest malaria burden. A core obstacle is that the data required to perform malaria surveillance are fragmented in multiple data silos distributed across geographic regions. Furthermore, consistent integrated malaria data sources are few, and a low degree of interoperability exists between them. As a result, it is difficult to identify disease trends and to plan for effective interventions.
OBJECTIVE: We propose the Semantics, Interoperability, and Evolution for Malaria Analytics (SIEMA) platform for use in malaria surveillance based on semantic data federation. Using this approach, it is possible to access distributed data, extend and preserve interoperability between multiple dynamic distributed malaria sources, and facilitate detection of system changes that can interrupt mission-critical global surveillance activities.
METHODS: We used Semantic Automated Discovery and Integration (SADI) Semantic Web Services to enable data access and improve interoperability, and the graphical user interface-enabled semantic query engine HYDRA to implement the target queries typical of malaria programs. We implemented a custom algorithm to detect changes to community-developed terminologies, data sources, and services that are core to SIEMA. This algorithm reports to a dashboard. Valet SADI is used to mitigate the impact of changes by rebuilding affected services.
RESULTS: We developed a prototype surveillance and change management platform from a combination of third-party tools, community-developed terminologies, and custom algorithms. We illustrated a methodology and core infrastructure to facilitate interoperable access to distributed data sources using SADI Semantic Web services. This degree of access makes it possible to implement complex queries needed by our user community with minimal technical skill. We implemented a dashboard that reports on terminology changes that can render the services inactive, jeopardizing system interoperability. Using this information, end users can control and reactively rebuild services to preserve interoperability and minimize service downtime.
CONCLUSIONS: We introduce a framework suitable for use in malaria surveillance that supports the creation of flexible surveillance queries across distributed data resources. The platform provides interoperable access to target data sources, is domain agnostic, and with updates to core terminological resources is readily transferable to other surveillance activities. A dashboard enables users to review changes to the infrastructure and invoke system updates. The platform significantly extends the range of functionalities offered by malaria information systems, beyond the state-of-the-art.
PMID: 29907554 [PubMed]
Novel phenotype-disease matching tool for rare genetic diseases.
Genet Med. 2018 Jun 12;:
Authors: Chen J, Xu H, Jegga A, Zhang K, White PS, Zhang G
Abstract
PURPOSE: To improve the accuracy of matching rare genetic diseases based on a patient's phenotypes.
METHODS: We introduce new methods to prioritize diagnosis of genetic diseases based on integrated semantic similarity (method 1) and ontological overlap (method 2) between the phenotypes expressed by a patient and phenotypes annotated to known diseases.
RESULTS: We evaluated the performance of our methods on two sets of simulated data and one set of patient data derived from electronic health records. We demonstrated that the two methods achieved significantly improved performance compared with previous methods in correctly prioritizing candidate diseases in all three sets. Our methods are freely available as a web application (https://gddp.research.cchmc.org/) to aid diagnosis of genetic diseases.
CONCLUSION: Our methods can capture the diagnostic information embedded in the phenotype ontology, consider all phenotypes exhibited by a patient, and are more robust than the existing methods when phenotypes are incorrectly or imprecisely specified. These methods can assist the diagnosis of rare genetic diseases and help the interpretation of the results of DNA tests.
PMID: 29895857 [PubMed - as supplied by publisher]
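The ontological-overlap idea in the entry above (method 2) can be illustrated with a small sketch. Everything here is assumed for illustration: the abstract does not define the overlap measure, so a Jaccard index over ancestor-closed phenotype sets is swapped in, and the toy ontology, term names and disease annotations are hypothetical.

```python
def ancestors(term, parents):
    """Return the term together with all of its ancestors in the ontology."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def closure(terms, parents):
    """Ancestor closure of a set of phenotype terms."""
    closed = set()
    for term in terms:
        closed |= ancestors(term, parents)
    return closed


def overlap(patient_terms, disease_terms, parents):
    """Jaccard overlap between the ancestor closures of two phenotype sets."""
    p = closure(patient_terms, parents)
    d = closure(disease_terms, parents)
    return len(p & d) / len(p | d)


def rank_diseases(patient_terms, annotations, parents):
    """Rank diseases by decreasing overlap with the patient's phenotypes."""
    scored = [(name, overlap(patient_terms, terms, parents))
              for name, terms in annotations.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical mini-ontology: child term -> list of parent terms.
parents = {
    "focal_seizure": ["seizure"],
    "seizure": ["neuro"],
    "ataxia": ["neuro"],
    "neuro": ["root"],
    "cardiomyopathy": ["cardio"],
    "cardio": ["root"],
}
annotations = {
    "disease_A": {"seizure", "ataxia"},
    "disease_B": {"cardiomyopathy"},
}
ranked = rank_diseases({"focal_seizure"}, annotations, parents)
```

Because the comparison is made on ancestor closures, a patient recorded with the more specific term focal_seizure still matches a disease annotated with seizure, which is one way a method can stay robust when phenotypes are specified imprecisely.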
An Observation Capability Semantic-Associated Approach to the Selection of Remote Sensing Satellite Sensors: A Case Study of Flood Observations in the Jinsha River Basin.
Sensors (Basel). 2018 May 21;18(5):
Authors: Hu C, Li J, Lin X, Chen N, Yang C
Abstract
Observation schedules depend upon the accurate understanding of a single sensor’s observation capability and the interrelated observation capability information on multiple sensors. The general ontologies for sensors and observations are abundant. However, few observation capability ontologies for satellite sensors are available, and no study has described the dynamic associations among the observation capabilities of multiple sensors used for integrated observational planning. This limitation results in a failure to realize effective sensor selection. This paper develops a sensor observation capability association (SOCA) ontology model that revolves around the task-sensor-observation capability (TSOC) ontology pattern. The pattern is developed considering the stimulus-sensor-observation (SSO) ontology design pattern, which focuses on facilitating sensor selection for one observation task. The core aim of the SOCA ontology model is to achieve an observation capability semantic association. A prototype system called SemOCAssociation was developed, and an experiment was conducted for flood observations in the Jinsha River basin in China. The results of this experiment verified that the SOCA ontology based association method can help sensor planners intuitively and accurately make evidence-based sensor selection decisions for a given flood observation task, which facilitates efficient and effective observational planning for flood satellite sensors.
PMID: 29883425 [PubMed - in process]
Understanding the health economic burden of patients with tuberous sclerosis complex (TSC) with epilepsy: a retrospective cohort study in the UK Clinical Practice Research Datalink (CPRD).
BMJ Open. 2017 Oct 05;7(10):e015236
Authors: Shepherd C, Koepp M, Myland M, Patel K, Miglio C, Siva V, Gray E, Neary M
Abstract
INTRODUCTION: Epilepsy is highly prevalent in tuberous sclerosis complex (TSC), a multi-system genetic disorder. The clinical and economic burden of this condition is expected to be substantial due to treatment challenges, debilitating co-morbidities and the relationship between TSC-related manifestations. This study estimated healthcare resource utilisation (HCRU) and costs for patients with TSC with epilepsy (TSC+E) in the UK.
METHODS: Patients with TSC+E in the Clinical Practice Research Datalink (CPRD) linked to Hospital Episodes Statistics were identified from April 1997 to March 2012. Clinical data were extracted over the entire history, and costs were reported over the most recent 3-year period. HCRU was compared with a matched Comparator cohort, and the key cost drivers were identified by regression modelling.
RESULTS: In total, 209 patients with TSC+E were identified, of which 40% recorded ≥2 other primary organ system manifestations and 42% had learning disability. Treatment with ≥2 concomitant antiepileptic drugs (AEDs) was prevalent (60%), potentially suggesting refractory epilepsy. Notwithstanding, many patients with TSC+E (12%) had no record of AED use in their entire history, which may indicate undertreatment for these patients. Brain surgery was recorded in 12% of patients. Routine electroencephalography and MRI were infrequently performed (30% of patients), yet general practitioner visits, hospitalisations and outpatient visits were more frequent in patients with TSC+E than the Comparator. This translated to threefold higher clinical costs (£14 335 vs £4448), which significantly increased with each additional primary manifestation (p<0.0001).
CONCLUSIONS: Patients with TSC+E have increased HCRU compared with the general CPRD population, likely related to manifestations in several organ systems, substantial cognitive impairment and severe epilepsy, which is challenging to treat and may be intractable. Disease surveillance and testing appear to be inadequate with few treatments trialled. Multidisciplinary care in TSC clinics with specialist neurologist input may alleviate some of the morbidity of patients, but more innovative treatment and management options should be sought.
PMID: 28982809 [PubMed - indexed for MEDLINE]
Trends, causes and timing of 30-day readmissions after hospitalization for heart failure: 11-year population-based analysis with linked data.
Int J Cardiol. 2017 Dec 01;248:246-251
Authors: Fernandez-Gasso L, Hernando-Arizaleta L, Palomar-Rodríguez JA, Abellán-Pérez MV, Pascual-Figal DA
Abstract
BACKGROUND: Reliable data are necessary if the burden of early readmissions following hospitalization for heart failure (HF) is to be addressed. We studied unplanned 30-day readmissions, their causes and timing over an 11-year period, using population-based linked data.
METHODS: All hospitalizations from 2003 to 2013 were analyzed by using administrative linked data based on the Minimum Basic Set discharge registry of the Department of Health (Region of Murcia, Spain). Index hospitalizations with HF as principal diagnosis (n=27,581) were identified. Transfers between centers were merged into one discharge. Readmissions were defined as unplanned admissions to any hospital within 30-days after discharge.
RESULTS: In the 2003-2013 period, 30-day readmission rates had a relative mean annual growth of +1.36%, increasing from 17.6% to 22.1%, with similar trends for cardiovascular and non-cardiovascular causes. The figure of 22.1% decreased to 19.8% when only same-hospital readmissions were considered. Most readmissions were due to cardiovascular causes (60%), HF being the most common single cause (34%). The timing of readmission shows an early peak on the fourth day post discharge (+13.29%) due to causes other than HF, followed by a gradual decline (-3.32%); readmission for HF decreased steadily from the first day (-2.22%). Readmission for HF (12.7%) or non-cardiovascular causes (13.3%) had higher in-hospital mortality rates than the index hospitalization (9.2%, p<0.001). Age and comorbidity burden were the main predictors of any readmission, but the performance of a predictive model was poor.
CONCLUSION: These findings support the need for population-based strategies to reduce the burden of early-unplanned readmissions.
PMID: 28801153 [PubMed - indexed for MEDLINE]
The National Sleep Research Resource: towards a sleep data commons.
J Am Med Inform Assoc. 2018 May 31;:
Authors: Zhang GQ, Cui L, Mueller R, Tao S, Kim M, Rueschman M, Mariani S, Mobley D, Redline S
Abstract
Objective: The gold standard for diagnosing sleep disorders is polysomnography, which generates extensive data about biophysical changes occurring during sleep. We developed the National Sleep Research Resource (NSRR), a comprehensive system for sharing sleep data. The NSRR embodies elements of a data commons aimed at accelerating research to address critical questions about the impact of sleep disorders on important health outcomes.
Approach: We used a metadata-guided approach, with a set of common sleep-specific terms enforcing uniform semantic interpretation of data elements across three main components: (1) annotated datasets; (2) user interfaces for accessing data; and (3) computational tools for the analysis of polysomnography recordings. We incorporated the process for managing dataset-specific data use agreements, evidence of Institutional Review Board review, and the corresponding access control in the NSRR web portal. The metadata-guided approach facilitates structural and semantic interoperability, ultimately leading to enhanced data reusability and scientific rigor.
Results: The authors curated and deposited retrospective data from 10 large, NIH-funded sleep cohort studies, including several from the Trans-Omics for Precision Medicine (TOPMed) program, into the NSRR. The NSRR currently contains data on 26 808 subjects and 31 166 signal files in European Data Format. Launched in April 2014, over 3000 registered users have downloaded over 130 terabytes of data.
Conclusions: The NSRR offers a use case and an example for creating a full-fledged data commons. It provides a single point of access to analysis-ready physiological signals from polysomnography obtained from multiple sources, and a wide variety of clinical data to facilitate sleep research.
PMID: 29860441 [PubMed - as supplied by publisher]
Spark-MCA: Large-scale, Exhaustive Formal Concept Analysis for Evaluating the Semantic Completeness of SNOMED CT.
AMIA Annu Symp Proc. 2017;2017:1931-1940
Authors: Wei Z, Licong C, Guo-Qiang Z
Abstract
The completeness of a medical terminology system consists of two parts: complete content coverage and complete semantics. In this paper, we focus on semantic completeness and present a scalable approach, called Spark-MCA, for evaluating the semantic completeness of SNOMED CT. We formulate the SNOMED CT contents into an FCA-based formal context, in which SNOMED CT concepts are used for extents, while their attributes are used as intents. We applied Spark-MCA to the 201403 US edition of SNOMED CT to exhaustively compute all the formal concepts and sub concept relationships in about 2 hours with 96 processors using an Amazon Web Service cluster. We found a total of 799,868 formal concepts, within which 500,583 are not contained in the 201403 release. We compared these concepts with the cumulative addition of 22,687 concepts from the 5 "delta" files from the 201403 release to the 201609 release. 3,231 matches were found between those suggested by FCA and those from cumulative concept addition by the SNOMED CT Editorial Panel. This result provides encouraging evidence that our approach could be useful for enhancing the semantic completeness of SNOMED CT.
PMID: 29854265 [PubMed - in process]
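The formal concept analysis (FCA) step in the Spark-MCA entry above can be illustrated on a toy formal context. This is a brute-force sketch for intuition only, not the distributed Spark-MCA algorithm: it enumerates every subset of objects, so it is exponential and usable only on tiny inputs. The three disease names and their attribute labels are hypothetical stand-ins for SNOMED CT concepts and their attributes.

```python
from itertools import combinations


def formal_concepts(context):
    """Return all (extent, intent) pairs of a formal context {object: frozenset(attributes)}."""
    all_attrs = frozenset().union(*context.values())

    def intent(objs):
        # Attributes shared by every object in objs (all attributes for the empty set).
        if not objs:
            return all_attrs
        return frozenset.intersection(*(context[o] for o in objs))

    def extent(attrs):
        # Objects that carry every attribute in attrs.
        return frozenset(o for o, a in context.items() if attrs <= a)

    concepts = set()
    objects = list(context)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            shared = intent(subset)
            concepts.add((extent(shared), shared))  # closed pair = formal concept
    return concepts


# Hypothetical toy context: named concepts and their defining attributes.
context = {
    "bacterial_pneumonia": frozenset({"site:lung", "agent:bacteria"}),
    "viral_pneumonia": frozenset({"site:lung", "agent:virus"}),
    "bacterial_meningitis": frozenset({"site:meninges", "agent:bacteria"}),
}
concepts = formal_concepts(context)

# Formal concepts whose intent matches no named object suggest candidate missing concepts.
named_intents = set(context.values())
candidates = {c for c in concepts if c[0] and c[1] and c[1] not in named_intents}
```

In this toy context the candidates are the two formal concepts defined only by "site:lung" and only by "agent:bacteria". They group existing objects but have no name of their own, which is analogous to the formal concepts the abstract reports as not contained in the release.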
Modeling Contextual Knowledge for Clinical Decision Support.
AMIA Annu Symp Proc. 2017;2017:1617-1624
Authors: Sordo M, Tokachichu P, Vitale CJ, Maviglia SM, Rocha RA
Abstract
In theory, the logic of decision rules should be atomic. In practice, this is not always possible; initially simple logic statements tend to be overloaded with additional conditions restricting the scope of such rules. By doing so, the original logic soon becomes encumbered with contextual knowledge. Contextual knowledge is re-usable on its own and could be modeled separately from the logic of a rule without losing the intended functionality. We model constraints to explicitly define the context where knowledge of decision rules is actionable. We borrowed concepts from Semantic Web, Complex Adaptive Systems, and Contextual Reasoning. The proposed approach provides the means for identifying and modeling contextual knowledge in a simple, sound manner. The methodology presented herein facilitates rule authoring, fosters consistency in rules implementation and maintenance; facilitates developing authoritative knowledge repositories to promote quality, safety and efficacy of healthcare; and paves the road for future work in knowledge discovery.
PMID: 29854232 [PubMed - in process]
Reconciliation of multiple guidelines for decision support: a case study on the multidisciplinary management of breast cancer within the DESIREE project.
AMIA Annu Symp Proc. 2017;2017:1527-1536
Authors: Séroussi B, Guézennec G, Lamy JB, Muro N, Larburu N, Sekar BD, Prebet C, Bouaud J
Abstract
Breast cancer is the most common cancer among women. DESIREE is a European project which aims at developing web-based services for the management of primary breast cancer by multidisciplinary breast units (BUs). We describe the guideline-based decision support system (GL-DSS) of the project. Various breast cancer clinical practice guidelines (CPGs) have been selected to be concurrently applied to provide state-of-the-art patient-specific recommendations. The aim is to reconcile CPG recommendations with the objective of complementarity to enlarge the number of clinical situations covered by the GL-DSS. Input and output data exchange with the GL-DSS is performed using FHIR. We used a knowledge model of the domain as an ontology on which relies the reasoning process performed by rules that encode the selected CPGs. Semantic web tools were used, notably the Euler/EYE inference engine, to implement the GL-DSS. "Rainbow boxes" are a synthetic tabular display used to visualize the inferred recommendations.
PMID: 29854222 [PubMed - in process]
Evaluation of Semantic Web Technologies for Storing Computable Definitions of Electronic Health Records Phenotyping Algorithms.
AMIA Annu Symp Proc. 2017;2017:1352-1361
Authors: Papež V, Denaxas S, Hemingway H
Abstract
Electronic Health Records (EHR) are electronic data generated during or as a byproduct of routine patient care. Structured, semi-structured and unstructured EHR offer researchers unprecedented phenotypic breadth and depth and have the potential to accelerate the development of precision medicine approaches at scale. A main EHR use-case is defining phenotyping algorithms that identify disease status, onset and severity. Phenotyping algorithms utilize diagnoses, prescriptions, laboratory tests, symptoms and other elements in order to identify patients with or without a specific trait. No common standardized, structured, computable format exists for storing phenotyping algorithms. The majority of algorithms are stored as human-readable descriptive text documents, which makes their translation to code challenging due to their inherent complexity and hinders their sharing and re-use across the community. In this paper, we evaluate the two key Semantic Web Technologies, the Web Ontology Language and the Resource Description Framework, for enabling computable representations of EHR-driven phenotyping algorithms.
PMID: 29854204 [PubMed - in process]
Mechanism-based Pharmacovigilance over the Life Sciences Linked Open Data Cloud.
AMIA Annu Symp Proc. 2017;2017:1014-1023
Authors: Kamdar MR, Musen MA
Abstract
Adverse drug reactions (ADR) result in significant morbidity and mortality in patients, and a substantial proportion of these ADRs are caused by drug-drug interactions (DDIs). Pharmacovigilance methods are used to detect unanticipated DDIs and ADRs by mining Spontaneous Reporting Systems, such as the US FDA Adverse Event Reporting System (FAERS). However, these methods do not provide mechanistic explanations for the discovered drug-ADR associations in a systematic manner. In this paper, we present a systems pharmacology-based approach to perform mechanism-based pharmacovigilance. We integrate data and knowledge from four different sources using Semantic Web Technologies and Linked Data principles to generate a systems network. We present a network-based Apriori algorithm for association mining in FAERS reports. We evaluate our method against existing pharmacovigilance methods for three different validation sets. Our method has AUROC statistics of 0.7-0.8, similar to current methods, and event-specific thresholds generate AUROC statistics greater than 0.75 for certain ADRs. Finally, we discuss the benefits of using Semantic Web technologies to attain the objectives for mechanism-based pharmacovigilance.
PMID: 29854169 [PubMed - in process]
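The AUROC statistic used in the evaluation above can be computed directly from its rank interpretation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as one half. The sketch below assumes nothing about the authors' pipeline; the scores and labels are made-up illustration values, not data from the paper.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, label in zip(scores, labels) if label]
    negatives = [s for s, label in zip(scores, labels) if not label]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; ties contribute 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical association scores for four drug-ADR pairs (1 = known true association).
example_scores = [0.9, 0.8, 0.4, 0.3]
example_labels = [1, 0, 1, 0]
example_auc = auroc(example_scores, example_labels)  # 3 of 4 pairs ordered correctly -> 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which gives context for the 0.7-0.8 range reported in the abstract.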
Enhanced functionalities for annotating and indexing clinical text with the NCBO Annotator.
Bioinformatics. 2018 Jun 01;34(11):1962-1965
Authors: Tchechmedjiev A, Abdaoui A, Emonet V, Melzi S, Jonnagaddala J, Jonquet C
Abstract
Summary: Secondary use of clinical data commonly involves annotating biomedical text with terminologies and ontologies. The National Center for Biomedical Ontology (NCBO) Annotator is a frequently used annotation service, originally designed for biomedical data, but not very suitable for clinical text annotation. In order to add new functionalities to the NCBO Annotator without hosting or modifying the original Web service, we have designed a proxy architecture that enables seamless extensions by pre-processing of the input text and parameters, and post-processing of the annotations. We have then implemented enhanced functionalities for annotating and indexing free text such as: scoring, detection of context (negation, experiencer, temporality), new output formats and coarse-grained concept recognition (with UMLS Semantic Groups). In this paper, we present the NCBO Annotator+, a Web service which incorporates these new functionalities as well as a small set of evaluation results for concept recognition and clinical context detection on two standard evaluation tasks (Clef eHealth 2017, SemEval 2014).
Availability and implementation: The Annotator+ has been successfully integrated into the SIFR BioPortal platform (an implementation of NCBO BioPortal for French biomedical terminologies and ontologies) to annotate English text. A Web user interface is available for testing and ontology selection (http://bioportal.lirmm.fr/ncbo_annotatorplus); however, the Annotator+ is meant to be used through the Web service application programming interface (http://services.bioportal.lirmm.fr/ncbo_annotatorplus). The code is openly available, and we also provide a Docker packaging to enable easy local deployment to process sensitive (e.g. clinical) data in-house (https://github.com/sifrproject).
Contact: andon.tchechmedjiev@lirmm.fr.
Supplementary information: Supplementary data are available at Bioinformatics online.
PMID: 29846492 [PubMed - in process]
Web pages: What can you see in a single fixation?
Cogn Res Princ Implic. 2018;3(1):14
Authors: Jahanian A, Keshvari S, Rosenholtz R
Abstract
Research in human vision suggests that in a single fixation, humans can extract a significant amount of information from a natural scene, e.g. the semantic category, spatial layout, and object identities. This ability is useful, for example, for quickly determining location, navigating around obstacles, detecting threats, and guiding eye movements to gather more information. In this paper, we ask a new question: What can we see at a glance at a web page - an artificial yet complex "real world" stimulus? Is it possible to notice the type of website, or where the relevant elements are, with only a glimpse? We find that observers, fixating at the center of a web page shown for only 120 milliseconds, are well above chance at classifying the page into one of ten categories. Furthermore, this ability is supported in part by text that they can read at a glance. Users can also understand the spatial layout well enough to reliably localize the menu bar and to detect ads, even though the latter are often camouflaged among other graphical elements. We discuss the parallels between web page gist and scene gist, and the implications of our findings for both vision science and human-computer interaction.
PMID: 29774229 [PubMed]
Standard Lexicons, Coding Systems and Ontologies for Interoperability and Semantic Computation in Imaging.
J Digit Imaging. 2018 May 03;:
Authors: Wang KC
Abstract
Standard clinical terms, codes, and ontologies promote clarity and interoperability. Within radiology, there is a variety of relevant content resources, tools and technologies. These provide the basis for fundamental imaging workflows such as reporting and billing, and also facilitate a range of applications in quality improvement and research. This article reviews the key characteristics of lexicons, coding systems, and ontologies. A number of standards are described, including International Classification of Diseases-10-Clinical Modification (ICD-10-CM), Current Procedural Terminology (CPT), Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT), Logical Observation Identifiers Names and Codes (LOINC), and RadLex. Tools for accessing this material are reviewed, such as the National Center for Biomedical Ontology BioPortal system. Web services are discussed as a mechanism for semantic application development. Several example systems, workflows, and research applications using semantic technology are also surveyed.
PMID: 29725962 [PubMed - as supplied by publisher]
The semantic distance task: Quantifying semantic distance with semantic network path length.
J Exp Psychol Learn Mem Cogn. 2017 Sep;43(9):1470-1489
Authors: Kenett YN, Levi E, Anaki D, Faust M
Abstract
Semantic distance is a determining factor in cognitive processes, such as semantic priming, operating upon semantic memory. The main computational approach to compute semantic distance is through latent semantic analysis (LSA). However, objections have been raised against this approach, mainly in its failure at predicting semantic priming. We propose a novel approach to computing semantic distance, based on network science methodology. Path length in a semantic network represents the number of steps needed to traverse from one word in the network to another. We examine whether path length can be used as a measure of semantic distance, by investigating how path length affects performance in a semantic relatedness judgment task and recall from memory. Our results show a differential effect on performance: Up to 4 steps separating word-pairs, participants exhibit an increase in reaction time (RT) and a decrease in the percentage of word-pairs judged as related. From 4 steps onward, participants exhibit a significant decrease in RT and the word-pairs are dominantly judged as unrelated. Furthermore, we show that as path length between word-pairs increases, success in free- and cued-recall decreases. Finally, we demonstrate how our measure outperforms computational methods measuring semantic distance (LSA and positive pointwise mutual information) in predicting participants' RT and subjective judgments of semantic strength. Thus, we provide a computational alternative to computing semantic distance. Furthermore, this approach addresses key issues in cognitive theory, namely the breadth of the spreading activation process and the effect of semantic distance on memory retrieval.
PMID: 28240936 [PubMed - indexed for MEDLINE]
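The path-length measure described in the entry above, the number of steps between two words in a semantic network, is a shortest-path computation. The sketch below uses breadth-first search on an unweighted, undirected graph; the tiny word network and the function names are hypothetical and are not taken from the study's materials.

```python
from collections import deque


def build_network(edges):
    """Build an undirected adjacency map from (word, word) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def path_length(graph, source, target):
    """Number of steps on the shortest path from source to target, or None if unreachable."""
    if source not in graph or target not in graph:
        return None
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        word, steps = queue.popleft()
        for neighbour in graph[word]:
            if neighbour == target:
                return steps + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, steps + 1))
    return None


# Hypothetical semantic network: an edge links two strongly associated words.
network = build_network([
    ("cat", "dog"), ("dog", "bone"), ("bone", "skeleton"),
    ("cat", "mouse"), ("sun", "moon"),
])
near = path_length(network, "cat", "dog")       # directly linked: 1 step
far = path_length(network, "cat", "skeleton")   # cat -> dog -> bone -> skeleton: 3 steps
```

Breadth-first search is sufficient because every edge counts as one step. A weighted network would need a different algorithm, such as Dijkstra's.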
Radiation Oncology Terminology Linker: A Step Towards a Linked Data Knowledge Base.
Stud Health Technol Inform. 2018;247:855-859
Authors: Lustberg T, van Soest J, Fick P, Fijten R, Hendriks T, Puts S, Dekker A
Abstract
Performing image feature extraction in radiation oncology is often dependent on the organ and tumor delineations provided by clinical staff. These delineation names are free-text DICOM metadata fields resulting in undefined information, which requires effort to use in large-scale image feature extraction efforts. In this work we present a scalable solution to overcome these naming convention challenges with a REST service using Semantic Web technology to convert this information to linked data. As a proof of concept, open source software is used to compute radiation oncology image features. The results of this work can be found in a public Bitbucket repository.
PMID: 29678082 [PubMed - in process]
Combining the Generic Entity-Attribute-Value Model and Terminological Models into a Common Ontology to Enable Data Integration and Decision Support.
Stud Health Technol Inform. 2018;247:541-545
Authors: Bouaud J, Guézennec G, Séroussi B
Abstract
The integration of clinical information models and termino-ontological models into a unique ontological framework is highly desirable for it facilitates data integration and management using the same formal mechanisms for both data concepts and information model components. This is particularly true for knowledge-based decision support tools that aim to take advantage of all facets of semantic web technologies in merging ontological reasoning, concept classification, and rule-based inferences. We present an ontology template that combines generic data model components with (parts of) existing termino-ontological resources. The approach is developed for the guideline-based decision support module on breast cancer management within the DESIREE European project. The approach is based on the entity attribute value model and could be extended to other domains.
PMID: 29678019 [PubMed - in process]
Exploring Semantic Data Federation to Enable Malaria Surveillance Queries.
Stud Health Technol Inform. 2018;247:6-10
Authors: Brenas JH, Al Manir MS, Zinszer K, Baker CJO, Shaban-Nejad A
Abstract
Malaria is an infectious disease affecting people across tropical countries. In order to devise efficient interventions, surveillance experts need to be able to answer increasingly complex queries integrating information coming from repositories distributed all over the globe. This, in turn, requires extraordinary coding abilities that cannot be expected from non-technical surveillance experts. In this paper, we present a deployment of Semantic Automated Discovery and Integration (SADI) Web services for the federation and querying of malaria data. More than 10 services were created to answer an example query requiring data coming from various sources. Our method assists surveillance experts in formulating their queries and gaining access to the answers they need.
PMID: 29677912 [PubMed - in process]