Semantic Web
Effect on life expectancy of temporal sequence in a multimorbidity cluster of psychosis, diabetes, and congestive heart failure among 1·7 million individuals in Wales with 20-year follow-up: a retrospective cohort study using linked data
Lancet Public Health. 2023 Jul;8(7):e535-e545. doi: 10.1016/S2468-2667(23)00098-1.
ABSTRACT
BACKGROUND: To inform targeted public health strategies, it is crucial to understand how coexisting diseases develop over time and their associated impacts on patient outcomes and health-care resources. This study aimed to examine how psychosis, diabetes, and congestive heart failure, in a cluster of physical-mental health multimorbidity, develop and coexist over time, and to assess the associated effects of different temporal sequences of these diseases on life expectancy in Wales.
METHODS: In this retrospective cohort study, we used population-scale, individual-level, anonymised, linked, demographic, administrative, and electronic health record data from the Wales Multimorbidity e-Cohort. We included data on all individuals aged 25 years and older who were living in Wales on Jan 1, 2000 (the start of follow-up), with follow-up continuing until Dec 31, 2019, first break in Welsh residency, or death. Multistate models were applied to these data to model trajectories of disease in multimorbidity and their associated effect on all-cause mortality, accounting for competing risks. Life expectancy was calculated as the restricted mean survival time (bound by the maximum follow-up of 20 years) for each of the transitions from the health states to death. Cox regression models were used to estimate baseline hazards for transitions between health states, adjusted for sex, age, and area-level deprivation (Welsh Index of Multiple Deprivation [WIMD] quintile).
FINDINGS: Our analyses included data for 1 675 585 individuals (811 393 [48·4%] men and 864 192 [51·6%] women) with a median age of 51·0 years (IQR 37·0-65·0) at cohort entry. The order of disease acquisition in cases of multimorbidity had an important and complex association with patient life expectancy. Individuals who developed diabetes, psychosis, and congestive heart failure, in that order (DPC), had reduced life expectancy compared with people who developed the same three conditions in a different order: for a 50-year-old man in the third quintile of the WIMD (on which we based our main analyses to allow comparability), DPC was associated with a loss in life expectancy of 13·23 years (SD 0·80) compared with the general otherwise healthy or otherwise diseased population. Congestive heart failure as a single condition was associated with a mean loss in life expectancy of 12·38 years (0·00), and with a loss of 12·95 years (0·06) when preceded by psychosis and 13·45 years (0·13) when followed by psychosis. Findings were robust in people of older ages, in more deprived populations, and in women, except that the trajectory of psychosis, congestive heart failure, and diabetes was associated with higher mortality in women than in men. Within 5 years of an initial diagnosis of diabetes, the risk of developing psychosis or congestive heart failure, or both, was increased.
INTERPRETATION: The order in which individuals develop psychosis, diabetes, and congestive heart failure as combinations of conditions can substantially affect life expectancy. Multistate models offer a flexible framework to assess temporal sequences of diseases and allow identification of periods of increased risk of developing subsequent conditions and death.
FUNDING: Health Data Research UK.
PMID:37393092 | DOI:10.1016/S2468-2667(23)00098-1
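The study's life-expectancy estimates are restricted mean survival times: the area under a survival curve truncated at the 20-year follow-up limit. A minimal sketch of that calculation (illustrative only, with invented follow-up data and a simple Kaplan-Meier estimator rather than the study's multistate Cox models):

```python
# Illustrative sketch, not the study's code: restricted mean survival time
# (RMST) as the area under a Kaplan-Meier curve truncated at tau years.

def kaplan_meier(times, events):
    """Return (time, survival) step points for right-censored data.
    times: follow-up in years; events: 1 = death, 0 = censored."""
    at_risk = len(times)
    surv, points = 1.0, []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        if deaths:
            surv *= 1 - deaths / at_risk
            points.append((t, surv))
        at_risk -= sum(1 for ti in times if ti == t)  # drop ties from risk set
    return points

def rmst(km_points, tau):
    """Integrate the survival step function from 0 to tau."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in km_points:
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (tau - prev_t)

# Invented follow-up data for eight individuals over 20 years.
times  = [2, 5, 5, 8, 12, 15, 20, 20]
events = [1, 1, 0, 1, 0, 1, 0, 0]
km = kaplan_meier(times, events)
restricted_le = rmst(km, 20)
```

A loss in life expectancy for a disease trajectory would then be the difference between the RMST of the comparison population and the RMST of that trajectory.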
FAIR-Checker: supporting digital resource findability and reuse with Knowledge Graphs and Semantic Web standards
J Biomed Semantics. 2023 Jul 1;14(1):7. doi: 10.1186/s13326-023-00289-5.
ABSTRACT
The current rise of Open Science and Reproducibility in the Life Sciences requires the creation of rich, machine-actionable metadata in order to better share and reuse biological digital resources such as datasets, bioinformatics tools, training materials, etc. For this purpose, FAIR principles have been defined for both data and metadata and adopted by large communities, leading to the definition of specific metrics. However, automatic FAIRness assessment is still difficult because computational evaluations frequently require technical expertise and can be time-consuming. As a first step to address these issues, we propose FAIR-Checker, a web-based tool to assess the FAIRness of metadata presented by digital resources. FAIR-Checker offers two main facets: a "Check" module providing a thorough metadata evaluation and recommendations, and an "Inspect" module which assists users in improving metadata quality and therefore the FAIRness of their resource. FAIR-Checker leverages Semantic Web standards and technologies such as SPARQL queries and SHACL constraints to automatically assess FAIR metrics. Users are notified of missing, necessary, or recommended metadata for various resource categories. We evaluate FAIR-Checker in the context of improving the FAIRification of individual resources, through better metadata, as well as analyzing the FAIRness of more than 25,000 bioinformatics software descriptions.
PMID:37393296 | DOI:10.1186/s13326-023-00289-5
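The "Check" idea, testing whether a resource's metadata exposes the properties a metric requires, can be sketched in miniature. The sketch below is a deliberate simplification with an invented property profile: FAIR-Checker itself evaluates metrics with SPARQL queries and SHACL shapes over RDF, not with a hard-coded key check.

```python
# Simplified sketch of a metadata FAIRness check over a JSON-LD record.
import json

# Hypothetical minimal profile: properties a dataset description should expose.
REQUIRED = {"name", "description", "identifier", "license"}
RECOMMENDED = {"keywords", "creator", "url"}

def check_metadata(jsonld_text):
    """Return (missing_required, missing_recommended) for one JSON-LD record."""
    record = json.loads(jsonld_text)
    present = set(record)
    return sorted(REQUIRED - present), sorted(RECOMMENDED - present)

doc = json.dumps({
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example dataset",
    "identifier": "doi:10.0000/example",
    "license": "https://creativecommons.org/licenses/by/4.0/",
})
missing_req, missing_rec = check_metadata(doc)
# missing_req == ['description']
```

The report produced by such a check is what drives the "notified of missing, necessary, or recommended metadata" behaviour described above.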
Investigating the potential of the semantic web for education: Exploring Wikidata as a learning platform
Educ Inf Technol (Dordr). 2023 Mar 13:1-50. doi: 10.1007/s10639-023-11664-1. Online ahead of print.
ABSTRACT
Wikidata is a free, multilingual, open knowledge base that stores structured, linked data. It has grown rapidly and, as of December 2022, contains over 100 million items and millions of statements, making it the largest semantic knowledge base in existence. By changing the interaction between people and knowledge, Wikidata offers various learning opportunities, leading to new applications in science, technology, and culture. These learning opportunities stem in part from the ability to query this data and ask questions that were difficult to answer in the past. They also stem from the ability to visualize query results, for example on a timeline or a map, which, in turn, helps users make sense of the data and draw additional insights from it. Research on the semantic web as a learning platform, and on Wikidata in the context of education, is almost non-existent, and we are just beginning to understand how to utilize it for educational purposes. This research investigates the Semantic Web as a learning platform, focusing on Wikidata as a prime example. To that end, a multiple-case-study methodology was adopted, demonstrating Wikidata uses by early adopters. Seven semi-structured, in-depth interviews were conducted, from which 10 distinct projects were extracted. A thematic analysis approach revealed eight main uses, as well as benefits of and challenges to engaging with the platform. The results shed light on Wikidata's potential to support lifelong learning, enabling opportunities for improved data literacy and worldwide social impact.
PMID:37361737 | PMC:PMC10009355 | DOI:10.1007/s10639-023-11664-1
PPIntegrator: semantic integrative system for protein-protein interaction and application for host-pathogen datasets
Bioinform Adv. 2023 Jun 1;3(1):vbad067. doi: 10.1093/bioadv/vbad067. eCollection 2023.
ABSTRACT
SUMMARY: Semantic web standards have proved important over the last 20 years in promoting data formalization and interlinking between existing knowledge graphs. In this context, several ontologies and data integration initiatives have emerged in recent years in the biological area, such as the broadly used Gene Ontology, which contains metadata to annotate gene function and subcellular location. Another important subject in the biological area is protein-protein interactions (PPIs), which have applications such as protein function inference. Current PPI databases export data in heterogeneous formats, which complicates their integration and analysis. Several ontologies covering concepts of the PPI domain are available to promote interoperability across datasets, but efforts to establish guidelines for automatic semantic data integration and analysis of PPIs in these datasets remain limited. Here, we present PPIntegrator, a system that semantically describes data related to protein interactions. We also introduce an enrichment pipeline to generate, predict, and validate new potential host-pathogen datasets by transitivity analysis. PPIntegrator contains a data preparation module to organize data from three reference databases, and a triplification and data fusion module to describe provenance information and results. This work provides an overview of the PPIntegrator system applied to integrate and compare host-pathogen PPI datasets from four bacterial species using our proposed transitivity analysis pipeline. We also demonstrate some critical queries for analyzing this kind of data and highlight the importance and usage of the semantic data generated by our system.
AVAILABILITY AND IMPLEMENTATION: https://github.com/YasCoMa/ppintegrator, https://github.com/YasCoMa/ppi_validation_process and https://github.com/YasCoMa/predprin.
PMID:37359724 | PMC:PMC10290227 | DOI:10.1093/bioadv/vbad067
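The transitivity analysis mentioned above can be illustrated with a toy sketch (the protein names are invented, and the real pipeline adds semantic description, provenance, and validation steps): if A interacts with B, and B interacts with C, the unobserved pair (A, C) becomes a candidate interaction for downstream validation.

```python
# Toy transitivity-based PPI candidate generation.

def transitive_candidates(edges):
    """edges: set of undirected interaction pairs. Returns novel (a, c) pairs
    linked through a shared partner but not observed directly."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    observed = {frozenset(e) for e in edges}
    candidates = set()
    for partners in neighbors.values():
        for a in partners:
            for c in partners:
                if a < c and frozenset((a, c)) not in observed:
                    candidates.add((a, c))
    return candidates

# Invented host-pathogen interaction edges.
ppi = {("hostA", "pathX"), ("pathX", "hostB"), ("hostB", "pathY")}
new_pairs = sorted(transitive_candidates(ppi))
```

In a real pipeline each candidate pair would then be scored and validated rather than accepted outright.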
3D model retrieval based on interactive attention CNN and multiple features
PeerJ Comput Sci. 2023 Feb 10;9:e1227. doi: 10.7717/peerj-cs.1227. eCollection 2023.
ABSTRACT
3D (three-dimensional) models are widely applied in daily life, in areas such as mechanical manufacturing, games, biochemistry, art, and virtual reality. With the exponential growth of 3D models on the web and in model libraries, there is an increasing need to retrieve a desired model accurately from a freehand sketch, and researchers are applying machine learning to 3D model retrieval. In this article, we combine a semantic feature, shape distribution features, and a gist feature to retrieve 3D models using an interactive attention convolutional neural network (CNN), with the aim of improving retrieval accuracy. Firstly, 2D (two-dimensional) views are extracted from the 3D model at six different angles and converted into line drawings. Secondly, an interactive attention module is embedded into the CNN to extract semantic features, adding data interaction between two CNN layers; the interactive attention CNN extracts effective features from the 2D views, while the gist algorithm and a 2D shape distribution (SD) algorithm extract global features. Thirdly, Euclidean distance is used to calculate the similarity of the semantic, gist, and shape distribution features between the sketch and each 2D view, and the weighted sum of the three similarities is used to compute the overall sketch-view similarity for retrieving the 3D model. This addresses the low retrieval accuracy caused by poor extraction of semantic features. Nearest neighbor (NN), first tier (FT), second tier (ST), F-measure (E(F)), and discounted cumulated gain (DCG) are used to evaluate retrieval performance. Experiments conducted on ModelNet40 show that the proposed method outperforms comparable methods and is feasible for 3D model retrieval.
PMID:37346676 | PMC:PMC10280475 | DOI:10.7717/peerj-cs.1227
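The retrieval step described above, per-feature similarities derived from Euclidean distance and combined by a weighted sum, can be sketched as follows. The feature vectors and weights are invented for illustration; the paper derives its features from the interactive attention CNN, gist, and shape distribution algorithms.

```python
# Sketch of weighted multi-feature similarity between a sketch and a 2D view.
import math

def euclidean_similarity(u, v):
    """Map Euclidean distance to a (0, 1] similarity score."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return 1.0 / (1.0 + dist)

def combined_similarity(feats_a, feats_b, weights):
    """Weighted sum of per-feature similarities; weights sum to 1."""
    return sum(w * euclidean_similarity(feats_a[k], feats_b[k])
               for k, w in weights.items())

weights = {"semantic": 0.5, "gist": 0.3, "sd": 0.2}  # hypothetical weights
sketch     = {"semantic": [0.2, 0.9], "gist": [0.5, 0.1], "sd": [0.3, 0.3]}
view_match = {"semantic": [0.2, 0.9], "gist": [0.5, 0.1], "sd": [0.3, 0.3]}
view_other = {"semantic": [0.9, 0.2], "gist": [0.4, 0.6], "sd": [0.1, 0.8]}

score_match = combined_similarity(sketch, view_match, weights)
score_other = combined_similarity(sketch, view_other, weights)
```

Ranking candidate models by this combined score is what turns three separate feature comparisons into a single retrieval ordering.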
AI-SPedia: a novel ontology to evaluate the impact of research in the field of artificial intelligence
PeerJ Comput Sci. 2022 Sep 22;8:e1099. doi: 10.7717/peerj-cs.1099. eCollection 2022.
ABSTRACT
BACKGROUND: Sharing knowledge such as resources, research results, and scholarly documents, is of key importance to improving collaboration between researchers worldwide. Research results from the field of artificial intelligence (AI) are vital to share because of the extensive applicability of AI to several other fields of research. This has led to a significant increase in the number of AI publications over the past decade. The metadata of AI publications, including bibliometrics and altmetrics indicators, can be accessed by searching familiar bibliographical databases such as Web of Science (WoS), which enables the impact of research to be evaluated and rising researchers and trending topics in the field of AI to be identified.
PROBLEM DESCRIPTION: In general, bibliographical databases have two limitations in terms of the type and form of metadata we aim to improve. First, most bibliographical databases, such as WoS, are more concerned with bibliometric indicators and do not offer a wide range of altmetric indicators to complement traditional bibliometric indicators. Second, the traditional format in which data is downloaded from bibliographical databases limits users to keyword-based searches without considering the semantics of the data.
PROPOSED SOLUTION: To overcome these limitations, we developed a repository, named AI-SPedia. The repository contains semantic knowledge of scientific publications concerned with AI and considers both the bibliometric and altmetric indicators. Moreover, it uses semantic web technology to produce and store data to enable semantic-based searches. Furthermore, we devised related competency questions to be answered by posing smart queries against the AI-SPedia datasets.
RESULTS: The results revealed that AI-SPedia can evaluate the impact of AI research by exploiting knowledge that is not explicitly mentioned but extracted using the power of semantics. Moreover, a simple analysis was performed based on the answered questions to help make research policy decisions in the AI domain. The end product, AI-SPedia, is considered the first attempt to evaluate the impacts of AI scientific publications using both bibliometric and altmetric indicators and the power of semantic web technology.
PMID:37346315 | PMC:PMC10280256 | DOI:10.7717/peerj-cs.1099
A comparison of approaches to accessing existing biological and chemical relational databases via SPARQL
J Cheminform. 2023 Jun 20;15(1):61. doi: 10.1186/s13321-023-00729-5.
ABSTRACT
Current biological and chemical research is increasingly dependent on the reusability of previously acquired data, which typically come from various sources. Consequently, there is a growing need for database systems and databases stored in them to be interoperable with each other. One of the possible solutions to address this issue is to use systems based on Semantic Web technologies, namely on the Resource Description Framework (RDF) to express data and on the SPARQL query language to retrieve the data. Many existing biological and chemical databases are stored in the form of a relational database (RDB). Converting a relational database into the RDF form and storing it in a native RDF database system may not be desirable in many cases. It may be necessary to preserve the original database form, and having two versions of the same data may not be convenient. A solution may be to use a system mapping the relational database to the RDF form. Such a system keeps data in their original relational form and translates incoming SPARQL queries to equivalent SQL queries, which are evaluated by a relational-database system. This review compares different RDB-to-RDF mapping systems with a primary focus on those that can be used free of charge. In addition, it compares different approaches to expressing RDB-to-RDF mappings. The review shows that these systems represent a viable method providing sufficient performance. Their real-life performance is demonstrated on data and queries coming from the neXtProt project.
PMID:37340506 | DOI:10.1186/s13321-023-00729-5
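The direct-mapping idea underlying RDB-to-RDF systems can be sketched in a few lines. This is a deliberately naive illustration: production mappers driven by R2RML or similar mapping languages are far more configurable, and the systems compared in the review translate SPARQL queries to SQL on the fly rather than materialising triples as done here.

```python
# Naive direct mapping: each row becomes a subject IRI, each column a triple.

def rows_to_triples(table, pk, rows, base="http://example.org/"):
    """Map relational rows (dicts) to (subject, predicate, object) triples."""
    triples = []
    for row in rows:
        subject = f"<{base}{table}/{row[pk]}>"
        for column, value in row.items():
            predicate = f"<{base}{table}#{column}>"
            triples.append((subject, predicate, f'"{value}"'))
    return triples

# Invented chemistry-flavoured table.
compounds = [{"id": 1, "name": "aspirin"}, {"id": 2, "name": "caffeine"}]
triples = rows_to_triples("compound", "id", compounds)
```

A query-rewriting system achieves the same logical view without ever storing these triples, which is why it avoids the duplicate-data problem described above.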
Excess Hospital Burden Among Young People in Contact With Homelessness Services in South Australia: A Prospective Linked Data Study
J Adolesc Health. 2023 Sep;73(3):519-526. doi: 10.1016/j.jadohealth.2023.04.018. Epub 2023 Jun 16.
ABSTRACT
PURPOSE: Youth homelessness remains an ongoing public health issue worldwide. We aimed to describe the burden of emergency department (ED) presentations and hospitalizations among a South Australian population of young people in contact with specialist homelessness services (SHS).
METHODS: This whole-of-population study used de-identified, linked administrative data from the Better Evidence Better Outcomes Linked Data (BEBOLD) platform on all individuals born between 1996 and 1998 (N = 57,509). The Homelessness2Home data collection was used to identify 2,269 young people in contact with SHS at ages 16-17 years. We followed these 57,509 individuals to age 18-19 years and compared ED presentations and hospital separations related to mental health, self-harm, drug and alcohol, injury, oral health, respiratory conditions, diabetes, pregnancy, and potentially preventable hospitalizations between those in contact and not in contact with SHS.
RESULTS: Four percent of young people had contact with SHS at ages 16-17 years. Young people who had contact with SHS were two and three times more likely, respectively, to have presented to an ED and to have been hospitalized, compared with those who did not contact SHS, accounting for 13% of all ED presentations and 16% of all hospitalizations in this age group. Causes of this excess burden included mental health, self-harm, drug and alcohol, diabetes, and pregnancy. On average, young people in contact with SHS experienced an increased length of stay in ED (+0.6 hours) and hospital (+0.7 days) per presentation, and were more likely not to wait for treatment in ED and to self-discharge from hospital.
DISCUSSION: The 4% of young people who contacted SHS at ages 16-17 years accounted for 13% and 16% of all ED presentations and hospitalizations respectively at age 18-19 years. Prioritizing access to stable housing and primary health-care services for adolescents in contact with SHS in Australia could improve health outcomes and reduce health-care costs.
PMID:37330707 | DOI:10.1016/j.jadohealth.2023.04.018
Mortality and cause of death during inpatient psychiatric care in New South Wales, Australia: A retrospective linked data study
J Psychiatr Res. 2023 Aug;164:51-58. doi: 10.1016/j.jpsychires.2023.05.043. Epub 2023 May 23.
ABSTRACT
BACKGROUND: Premature mortality in people with mental illness is well-documented, yet deaths during inpatient psychiatric care have received little research attention. This study investigates mortality rates and causes of death during inpatient psychiatric care in New South Wales (NSW), Australia. Risk factors for inpatient death were also explored.
METHODS: A retrospective cohort study using linked administrative datasets with complete capture of psychiatric admissions in NSW from 2002 to 2012 (n = 421,580) was conducted. Univariate and multivariate random-effects logistic regression analyses were used to explore risk factors for inpatient death.
RESULTS: The mortality rate during inpatient psychiatric care was 1.12 deaths per 1000 episodes of care and appeared to decline over the study period. Suicide accounted for 17% of inpatient deaths, while physical health causes accounted for 75% of all deaths. Thirty percent of these deaths were considered potentially avoidable. In the multivariate model, male sex, unknown address, and several physical health diagnoses were associated with increased odds of inpatient death.
CONCLUSIONS: The mortality rate and number of avoidable deaths during inpatient psychiatric care were substantial and warrant further systemic investigation. This was driven by a dual burden of physical health conditions and suicide. Strategies to improve access to physical health care on psychiatric inpatient wards and prevent inpatient suicide are necessary. A coordinated approach to monitoring psychiatric inpatient deaths in Australia is not currently available and much needed.
PMID:37315354 | DOI:10.1016/j.jpsychires.2023.05.043
Hospital-service use in the last year of life by patients aged ⩾60 years who died of heart failure or cardiomyopathy: A retrospective linked data study
Palliat Med. 2023 Sep;37(8):1232-1240. doi: 10.1177/02692163231180912. Epub 2023 Jun 12.
ABSTRACT
BACKGROUND: Understanding patterns of health care use in the last year of life is critical in health services planning.
AIM: To describe hospital-based service and palliative care use in hospital in the year preceding death for patients who died of heart failure or cardiomyopathy in Queensland from 2008 to 2018 and had at least one hospitalisation in the year preceding death.
DESIGN: A retrospective data linkage study was conducted using administrative health data relating to hospitalisations, emergency department visits and deaths.
PARTICIPANTS AND SETTING: Participants included were those aged ⩾60 years, had a hospitalisation in their last year of life and died of heart failure or cardiomyopathy in Queensland, Australia.
RESULTS: Of the 4697 participants, there were 25,583 hospital admissions. Nearly three-quarters (n = 3420, 73%) of participants were aged ⩾80 years and over half died in hospital (n = 2886, 61%). The median number of hospital admissions in the last year of life was 3 (interquartile range [IQR] 2-5). The care type was recorded as 'acute' for 89% (n = 22,729) of hospital admissions, and few (n = 853, 3%) hospital admissions had a care type recorded as 'palliative.' Of the 4697 participants, 3458 had emergency department visit(s), presenting 10,330 times collectively.
CONCLUSION: In this study, patients who died of heart failure or cardiomyopathy were predominantly aged ⩾80 years and over half died in hospital. These patients experienced repeat acute hospitalisations in the year preceding death. Improving timely access to palliative care services in the outpatient or community setting is needed for patients with heart failure.
PMID:37306096 | DOI:10.1177/02692163231180912
A Systematic Review of Location Data for Depression Prediction
Int J Environ Res Public Health. 2023 May 29;20(11):5984. doi: 10.3390/ijerph20115984.
ABSTRACT
Depression contributes to a wide range of maladjustment problems. With the development of technology, objective measurement of behavioral and functional indicators of depression has become possible through the passive sensing capabilities of digital devices. Focusing on location data, we systematically reviewed the relationship between depression and location data. We searched the Scopus, PubMed, and Web of Science databases by combining terms related to passive sensing and location data with depression. Thirty-one studies were included in this review. Location data demonstrated promising predictive power for depression. Among studies examining the relationship between individual location-data variables and depression, homestay, entropy, and normalized entropy showed the most consistent and significant correlations. Variables of distance, irregularity, and location also showed significant associations in some studies. However, semantic location showed inconsistent results, suggesting that the process of geographical movement is more related to mood changes than is the semantic location itself. Future research should converge on standardized methods for measuring location data across studies.
PMID:37297588 | DOI:10.3390/ijerph20115984
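Two of the location-data variables named in the review, homestay and (normalized) entropy, are straightforward to compute from time spent per place. A hedged sketch with an invented one-day record (the reviewed studies derive these variables from GPS clustering over weeks of sensing):

```python
# Sketch of homestay and location-entropy variables from time-per-place data.
import math

def location_entropy(time_per_place):
    """Shannon entropy (nats) of the distribution of time across places."""
    total = sum(time_per_place.values())
    probs = [t / total for t in time_per_place.values() if t > 0]
    return -sum(p * math.log(p) for p in probs)

def normalized_entropy(time_per_place):
    """Entropy scaled by log(number of places), giving a value in [0, 1]."""
    n = len(time_per_place)
    return location_entropy(time_per_place) / math.log(n) if n > 1 else 0.0

hours = {"home": 18, "work": 5, "gym": 1}  # invented one-day record
homestay = hours["home"] / sum(hours.values())
```

Lower entropy (time concentrated in few places) and higher homestay are the patterns the reviewed studies most consistently associated with depression.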
ARCH: Large-scale Knowledge Graph via Aggregated Narrative Codified Health Records Analysis
medRxiv. 2023 May 21:2023.05.14.23289955. doi: 10.1101/2023.05.14.23289955. Preprint.
ABSTRACT
OBJECTIVE: Electronic health record (EHR) systems contain a wealth of clinical data stored as both codified data and free-text narrative notes, covering hundreds of thousands of clinical concepts available for research and clinical care. The complex, massive, heterogeneous, and noisy nature of EHR data imposes significant challenges for feature representation, information extraction, and uncertainty quantification. To address these challenges, we proposed an efficient Aggregated naRrative Codified Health (ARCH) records analysis to generate a large-scale knowledge graph (KG) for a comprehensive set of EHR codified and narrative features.
METHODS: The ARCH algorithm first derives embedding vectors from a co-occurrence matrix of all EHR concepts and then generates cosine similarities along with associated p-values to measure the strength of relatedness between clinical features with statistical certainty quantification. In the final step, ARCH performs a sparse embedding regression to remove indirect linkage between entity pairs. We validated the clinical utility of the ARCH knowledge graph, generated from 12.5 million patients in the Veterans Affairs (VA) healthcare system, through downstream tasks including detecting known relationships between entity pairs, predicting drug side effects, disease phenotyping, as well as sub-typing Alzheimer's disease patients.
RESULTS: ARCH produces high-quality clinical embeddings and KG for over 60,000 EHR concepts, as visualized in the R-shiny powered web API (https://celehs.hms.harvard.edu/ARCH/). The ARCH embeddings attained an average area under the ROC curve (AUC) of 0.926 and 0.861 for detecting pairs of similar EHR concepts when the concepts are mapped to codified data and to NLP data; and 0.810 (codified) and 0.843 (NLP) for detecting related pairs. Based on the p-values computed by ARCH, the sensitivity of detecting similar and related entity pairs are 0.906 and 0.888 under false discovery rate (FDR) control of 5%. For detecting drug side effects, the cosine similarity based on the ARCH semantic representations achieved an AUC of 0.723, while the AUC improved to 0.826 after few-shot training via minimizing the loss function on the training data set. Incorporating NLP data substantially improved the ability to detect side effects in the EHR. For example, based on unsupervised ARCH embeddings, the power of detecting drug-side-effect pairs when using codified data only was 0.15, much lower than the power of 0.51 when using both codified and NLP concepts. Compared to existing large-scale representation learning methods including PubmedBERT, BioBERT and SAPBERT, ARCH attains the most robust performance and substantially higher accuracy in detecting these relationships. Incorporating ARCH-selected features in weakly supervised phenotyping algorithms can improve the robustness of algorithm performance, especially for diseases that benefit from NLP features as supporting evidence. For example, the phenotyping algorithm for depression attained an AUC of 0.927 when using ARCH-selected features but only 0.857 when using codified features selected via the KESER network[1]. In addition, embeddings and knowledge graphs generated from the ARCH network were able to cluster AD patients into two subgroups, where the fast-progression subgroup had a much higher mortality rate.
CONCLUSIONS: The proposed ARCH algorithm generates large-scale high-quality semantic representations and knowledge graph for both codified and NLP EHR features, useful for a wide range of predictive modeling tasks.
PMID:37293026 | PMC:PMC10246054 | DOI:10.1101/2023.05.14.23289955
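The relatedness measure at the core of ARCH, cosine similarity between concept embeddings, can be shown in miniature. The three-dimensional vectors and concept names below are invented; ARCH derives real embeddings from a co-occurrence matrix and additionally attaches p-values and prunes indirect links via sparse embedding regression.

```python
# Minimal cosine-similarity sketch over invented EHR concept embeddings.
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

emb = {
    "type2_diabetes": [0.9, 0.1, 0.2],
    "hyperglycemia":  [0.8, 0.2, 0.3],
    "ankle_fracture": [0.1, 0.9, 0.1],
}
related   = cosine(emb["type2_diabetes"], emb["hyperglycemia"])
unrelated = cosine(emb["type2_diabetes"], emb["ankle_fracture"])
```

In the knowledge graph, edges are kept only where such similarity is both high and statistically significant under FDR control.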
HighAltitudeOmicsDB, an integrated resource for high-altitude associated genes and proteins, networks and semantic-similarities
Sci Rep. 2023 Jun 8;13(1):9307. doi: 10.1038/s41598-023-35792-3.
ABSTRACT
Millions of people worldwide visit, live, or work in the hypoxic environment encountered at high altitudes, and it is important to understand the biomolecular responses to this stress in order to design mitigation strategies for high-altitude illnesses. Despite studies spanning more than 100 years, the complex mechanisms controlling acclimatization to hypoxia remain largely unknown. To identify potential diagnostic, therapeutic, and predictive markers for high-altitude (HA) stress, it is important to comprehensively compare and analyse these studies. Towards this goal, HighAltitudeOmicsDB is a unique resource that provides a comprehensive, curated, user-friendly, and detailed compilation of genes/proteins experimentally validated to be associated with various HA conditions, together with their protein-protein interactions (PPIs) and gene ontology (GO) semantic similarities. For each entry, HighAltitudeOmicsDB additionally stores the level of regulation (up/down-regulation), fold change, study control group, duration and altitude of exposure, tissue of expression, source organism, level of hypoxia, method of experimental validation, place/country of study, ethnicity, and geographical location. The database also collates information on disease and drug associations, tissue-specific expression levels, and GO and KEGG pathway associations. The web resource offers interactive PPI networks and GO semantic similarity matrices among the interactors, features that help to offer mechanistic insights into disease pathology. HighAltitudeOmicsDB is thus a unique platform for researchers in this area to explore, fetch, compare, and analyse HA-associated genes/proteins, their PPI networks, and GO semantic similarities. The database is available at http://www.altitudeomicsdb.in.
PMID:37291174 | DOI:10.1038/s41598-023-35792-3
A web framework for information aggregation and management of multilingual hate speech
Heliyon. 2023 May 9;9(5):e16084. doi: 10.1016/j.heliyon.2023.e16084. eCollection 2023 May.
ABSTRACT
Social media platforms have led to the creation of a vast amount of information produced by users and published publicly, facilitating participation in the public sphere, but also giving the opportunity for certain users to publish hateful content. This content mainly involves offensive/discriminative speech towards social groups or individuals (based on racial, religious, gender or other characteristics) and could possibly lead into subsequent hate actions/crimes due to persistent escalation. Content management and moderation in big data volumes can no longer be supported manually. In the current research, a web framework is presented and evaluated for the collection, analysis, and aggregation of multilingual textual content from various online sources. The framework is designed to address the needs of human users, journalists, academics, and the public to collect and analyze content from social media and the web in Spanish, Italian, Greek, and English, without prior training or a background in Computer Science. The backend functionality provides content collection and monitoring, semantic analysis including hate speech detection and sentiment analysis using machine learning models and rule-based algorithms, storing, querying, and retrieving such content along with the relevant metadata in a database. This functionality is assessed through a graphic user interface that is accessed using a web browser. An evaluation procedure was held through online questionnaires, including journalists and students, proving the feasibility of the use of the proposed framework by non-experts for the defined use-case scenarios.
PMID:37215824 | PMC:PMC10196859 | DOI:10.1016/j.heliyon.2023.e16084
Slowdowns in scalar implicature processing: Isolating the intention-reading costs in the Bott & Noveck task
Cognition. 2023 May 19;238:105480. doi: 10.1016/j.cognition.2023.105480. Online ahead of print.
ABSTRACT
An underinformative sentence, such as Some cats are mammals, is trivially true with a semantic (some and perhaps all) reading of the quantifier and false with a pragmatic (some but not all) one, with the latter reliably resulting in longer response times than the former in a truth evaluation task (Bott & Noveck, 2004). Most analyses attribute these prolonged reaction times, or costs, to the steps associated with the derivation of the scalar implicature. In the present work we investigate, across three experiments, whether such slowdowns can be attributed (at least partly) to the participant's need to adjust to the speaker's informative intention. In Experiment 1, we designed a web-based version of Bott & Noveck's (2004) laboratory task that most reliably reproduces its classic results. In Experiment 2 we found that, over the course of an experimental session, participants' pragmatic responses to underinformative sentences are initially reliably long and ultimately comparable to response times of logical interpretations of the same sentences. Such results cannot readily be explained by assuming that implicature derivation is a consistent source of processing effort. In Experiment 3, we further tested our account by examining how response times change as a function of the number of people said to produce the critical utterances. When participants are introduced (via a photo and description) to a single 'speaker', the results are similar to those found in Experiment 2. However, when they are introduced to two 'speakers', with the second 'speaker' appearing midway (after five encounters with underinformative items), we found a significant uptick in pragmatic response latencies to the underinformative item right after participants meet their second speaker (i.e. at their sixth encounter with an underinformative item).
Overall, we interpret these results as suggesting that at least part of the cost typically attributed to the derivation of a scalar implicature is actually a consequence of how participants think about the informative intentions of the person producing the underinformative sentences.
PMID:37210877 | DOI:10.1016/j.cognition.2023.105480
Disaster management ontology- an ontological approach to disaster management automation
Sci Rep. 2023 May 19;13(1):8091. doi: 10.1038/s41598-023-34874-6.
ABSTRACT
The geographical location of a region, together with large-scale environmental changes caused by a variety of factors, exposes it to a wide range of disasters. Floods, droughts, earthquakes, cyclones, landslides, tornadoes, and cloudbursts are all common natural disasters that destroy property and kill people. On average, 0.1% of total deaths globally in the past decade have been due to natural disasters. The National Disaster Management Authority (NDMA), a branch of the Ministry of Home Affairs, plays an important role in disaster management in India by taking responsibility for risk mitigation, response, and recovery from all natural and man-made disasters. This article presents an ontology-based disaster management framework built on the NDMA's responsibility matrix, named the Disaster Management Ontology (DMO). It supports task distribution among the relevant authorities at the various stages of a disaster, as well as a knowledge-driven decision support system for financial assistance to victims. In the proposed DMO, the ontology serves both to integrate knowledge and as a working platform for reasoners, and the Decision Support System (DSS) ruleset is written in the Semantic Web Rule Language (SWRL), which is grounded in first-order logic (FOL). In addition, OntoGraph, a class view of the taxonomy, is used to make the taxonomy more interactive for users.
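The SWRL-style, rule-driven task distribution described above can be sketched in miniature as follows. This is a minimal illustration: the class, property, and authority names (e.g. NDMA_ResponseCell) are hypothetical, not drawn from the actual DMO, and a plain forward-chaining loop stands in for an OWL reasoner.

```python
# Knowledge base as (subject, predicate, object) triples, RDF-style.
kb = {
    ("Flood2023", "rdf:type", "Flood"),
    ("Flood2023", "inPhase", "Response"),
}

# SWRL-like rules: if every antecedent pattern matches (under some variable
# binding), assert the consequent. Variables are strings starting with "?".
rules = [
    # NaturalDisaster(?d) ^ inPhase(?d, Response) -> assignedTo(?d, NDMA_ResponseCell)
    ([("?d", "rdf:type", "NaturalDisaster"), ("?d", "inPhase", "Response")],
     ("?d", "assignedTo", "NDMA_ResponseCell")),
    # Flood(?d) -> NaturalDisaster(?d)  (subclass inference, hand-written here)
    ([("?d", "rdf:type", "Flood")], ("?d", "rdf:type", "NaturalDisaster")),
]

def match(pattern, triple, binding):
    """Try to unify one triple pattern with a concrete triple."""
    b = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if b.get(p, t) != t:
                return None
            b[p] = t
        elif p != t:
            return None
    return b

def forward_chain(kb, rules):
    """Apply the rules to a fixed point and return the enriched triple set."""
    kb = set(kb)
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in rules:
            bindings = [{}]
            for pat in antecedents:
                bindings = [b2 for b in bindings for t in kb
                            if (b2 := match(pat, t, b)) is not None]
            for b in bindings:
                new = tuple(b.get(x, x) for x in consequent)
                if new not in kb:
                    kb.add(new)
                    changed = True
    return kb

inferred = forward_chain(kb, rules)
print(("Flood2023", "assignedTo", "NDMA_ResponseCell") in inferred)  # True
```

In a production setting, a Protégé-authored ontology with an attached SWRL ruleset and a reasoner such as Pellet would play the role of this hand-rolled loop.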
PMID:37208434 | DOI:10.1038/s41598-023-34874-6
The PrescIT Knowledge Graph: Supporting ePrescription to Prevent Adverse Drug Reactions
Stud Health Technol Inform. 2023 May 18;302:551-555. doi: 10.3233/SHTI230203.
ABSTRACT
Adverse Drug Reactions (ADRs) are an important public health issue, as they can impose significant health and monetary burdens. This paper presents the engineering and a use case of a Knowledge Graph supporting the prevention of ADRs as part of a Clinical Decision Support System (CDSS) developed in the context of the PrescIT project. The presented PrescIT Knowledge Graph is built upon Semantic Web technologies, namely the Resource Description Framework (RDF), and integrates widely used, relevant data sources and ontologies, i.e., DrugBank, SemMedDB, the OpenPVSignal Knowledge Graph, and DINTO, resulting in a lightweight and self-contained data source for evidence-based ADR identification.
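How a knowledge graph of this kind could back an ePrescription check can be sketched as follows. The triples and the adr_alerts helper are hypothetical illustrations, not the actual PrescIT graph or API.

```python
# Hypothetical drug-drug interaction and contraindication facts, stored as
# (subject, predicate, object, rationale) tuples in the style of an RDF graph.
kg = [
    ("warfarin", "interactsWith", "aspirin", "increased bleeding risk"),
    ("metformin", "contraindicatedIn", "renal_failure", "lactic acidosis risk"),
]

def adr_alerts(prescribed_drug, current_drugs, conditions):
    """Return ADR warnings for a new prescription given the patient context.
    (Interactions are checked in one direction only for brevity; real
    interaction relations are symmetric.)"""
    alerts = []
    for s, p, o, reason in kg:
        if p == "interactsWith" and s == prescribed_drug and o in current_drugs:
            alerts.append(f"{s} + {o}: {reason}")
        if p == "contraindicatedIn" and s == prescribed_drug and o in conditions:
            alerts.append(f"{s} in {o}: {reason}")
    return alerts

print(adr_alerts("warfarin", {"aspirin"}, set()))
# ['warfarin + aspirin: increased bleeding risk']
```

A CDSS would run such a check at prescription time and surface the rationale to the clinician alongside the alert.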
PMID:37203746 | DOI:10.3233/SHTI230203
An Annotation Workbench for Semantic Annotation of Data Collection Instruments
Stud Health Technol Inform. 2023 May 18;302:108-112. doi: 10.3233/SHTI230074.
ABSTRACT
Semantic interoperability, i.e., the ability to automatically interpret shared information in a meaningful way, is one of the most important requirements for analysing data from different sources. In the area of clinical and epidemiological studies, the target domain of the National Research Data Infrastructure for Personal Health Data (NFDI4Health), interoperability of data collection instruments such as case report forms (CRFs), data dictionaries, and questionnaires is critical. Retrospective integration of semantic codes into study metadata at the item level is important, as ongoing or completed studies contain valuable information that should be preserved. We present a first version of a Metadata Annotation Workbench to support annotators in dealing with a variety of complex terminologies and ontologies. User-driven development with users from the fields of nutritional epidemiology and chronic diseases ensured that the service fulfills the basic requirements for semantic metadata annotation software for these NFDI4Health use cases. The web application can be accessed using a web browser, and the source code of the software is available under an open-source MIT license.
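Item-level semantic annotation of the kind the workbench supports can be sketched as a minimal data model. The item name and the terminology code shown are illustrative examples, not the workbench's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class Annotation:
    system: str   # terminology or ontology, e.g. "SNOMED CT"
    code: str     # concept identifier within that terminology
    label: str    # human-readable concept label

@dataclass
class Item:
    name: str                          # CRF / data-dictionary item name
    annotations: list = field(default_factory=list)

# Retrospective annotation of an existing study's metadata at the item level:
# a variable from a completed study is linked to a terminology concept.
item = Item("systolic_bp")
item.annotations.append(
    Annotation("SNOMED CT", "271649006", "Systolic blood pressure"))
```

Storing annotations per item, rather than per instrument, is what makes variables from different studies comparable during later cross-study analysis.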
PMID:37203619 | DOI:10.3233/SHTI230074
A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond
IEEE Trans Pattern Anal Mach Intell. 2023 May 18;PP. doi: 10.1109/TPAMI.2023.3277122. Online ahead of print.
ABSTRACT
Non-autoregressive (NAR) generation, first proposed in neural machine translation (NMT) to speed up inference, has attracted much attention in both the machine learning and natural language processing communities. While NAR generation can significantly accelerate inference for machine translation, the speedup comes at the cost of reduced translation accuracy compared with its counterpart, autoregressive (AR) generation. In recent years, many new models and algorithms have been proposed to bridge the accuracy gap between NAR and AR generation. In this paper, we conduct a systematic survey with comparisons and discussions of various non-autoregressive translation (NAT) models from different aspects. Specifically, we categorize the efforts of NAT into several groups, including data manipulation, modeling methods, training criteria, decoding algorithms, and benefits from pre-trained models. Furthermore, we briefly review other applications of NAR models beyond machine translation, such as grammatical error correction, text summarization, text style transfer, dialogue, semantic parsing, and automatic speech recognition. We also discuss potential directions for future exploration, including reducing the dependency on knowledge distillation (KD), reasonable training objectives, pre-training for NAR, and wider applications. We hope this survey helps researchers capture the latest progress in NAR generation, inspires the design of advanced NAR models and algorithms, and enables industry practitioners to choose appropriate solutions for their applications. The web page of this survey is at https://github.com/LitterBrother-Xiao/Overview-of-Non-autoregressive-Applications.
PMID:37200120 | DOI:10.1109/TPAMI.2023.3277122
Integrating collective know-how for multicriteria decision support in agrifood chains-application to cheesemaking
Front Artif Intell. 2023 Apr 28;6:1145007. doi: 10.3389/frai.2023.1145007. eCollection 2023.
ABSTRACT
Agrifood chain processes rely on a wealth of knowledge, know-how, and experience forged over time. This collective expertise must be shared to improve food quality. Here we test the hypothesis that it is possible to design and implement a comprehensive methodology for creating a knowledge base that integrates collective expertise, while also using it to recommend the technical actions required to improve food quality. To test this hypothesis, we first list the functional specifications defined in collaboration with several partners (technical centers, vocational training schools, producers) over the course of several projects carried out in recent years. Secondly, we propose an innovative core ontology that uses the standard languages of the Semantic Web to represent knowledge in the form of decision trees. These decision trees depict potential causal relationships between situations of interest and provide recommendations for managing them through technological actions, together with a collective assessment of the efficiency of those actions. We show how mind map files created with mind-mapping tools are automatically translated into an RDF knowledge base using the core ontological model. Thirdly, a model to aggregate individual assessments provided by technicians and associated with technical action recommendations is proposed and evaluated. Finally, a multicriteria decision-support system (MCDSS) using the knowledge base is presented. It consists of an explanatory view allowing navigation in a decision tree and an action view for multicriteria filtering and identification of possible side effects. The different types of MCDSS-delivered answers to a query expressed in the action view are explained. The MCDSS graphical user interface is presented through a real use case. Experimental assessments have been performed and confirm that the tested hypothesis is relevant.
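The mind-map-to-RDF translation and the aggregation of individual assessments can be sketched as follows. The cheesemaking content, predicate names, and the simple mean used for aggregation are hypothetical stand-ins for the paper's actual ontology and aggregation model.

```python
from statistics import mean

# A mind map as nested dicts: situation of interest -> possible cause
# -> recommended technological action.
mind_map = {"bitter taste": {"over-acidification": "shorten ripening time"}}

def to_triples(mm):
    """Flatten the mind map into (subject, predicate, object) triples,
    mirroring the automatic translation into an RDF knowledge base."""
    triples = []
    for situation, causes in mm.items():
        for cause, action in causes.items():
            triples.append((situation, "hasPossibleCause", cause))
            triples.append((cause, "recommends", action))
    return triples

# Individual efficiency assessments (0-1 scale) from several technicians,
# aggregated here with a plain mean as a stand-in for the proposed model.
assessments = {"shorten ripening time": [0.8, 0.6, 0.9]}
collective = {action: mean(scores) for action, scores in assessments.items()}

print(to_triples(mind_map))
print(collective)
```

The explanatory view of the MCDSS would then navigate the situation-to-cause-to-action triples, while the action view would filter actions by their aggregated collective score.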
PMID:37187891 | PMC:PMC10175634 | DOI:10.3389/frai.2023.1145007