Semantic Web

ARCH: Large-scale Knowledge Graph via Aggregated Narrative Codified Health Records Analysis

Fri, 2023-06-09 06:00

medRxiv. 2023 May 21:2023.05.14.23289955. doi: 10.1101/2023.05.14.23289955. Preprint.

ABSTRACT

OBJECTIVE: Electronic health record (EHR) systems contain a wealth of clinical data stored as both codified data and free-text narrative notes, covering hundreds of thousands of clinical concepts available for research and clinical care. The complex, massive, heterogeneous, and noisy nature of EHR data imposes significant challenges for feature representation, information extraction, and uncertainty quantification. To address these challenges, we proposed an efficient Aggregated naRrative Codified Health (ARCH) records analysis to generate a large-scale knowledge graph (KG) for a comprehensive set of EHR codified and narrative features.

METHODS: The ARCH algorithm first derives embedding vectors from a co-occurrence matrix of all EHR concepts and then generates cosine similarities along with associated p-values to measure the strength of relatedness between clinical features with statistical certainty quantification. In the final step, ARCH performs a sparse embedding regression to remove indirect linkage between entity pairs. We validated the clinical utility of the ARCH knowledge graph, generated from 12.5 million patients in the Veterans Affairs (VA) healthcare system, through downstream tasks including detecting known relationships between entity pairs, predicting drug side effects, disease phenotyping, as well as sub-typing Alzheimer's disease patients.
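The first two steps described above (counting concept co-occurrences, then scoring relatedness with cosine similarity) can be sketched in a few lines. This is an illustrative toy, not the ARCH implementation: the `cooccurrence` and `cosine` helpers are hypothetical, and ARCH additionally derives embeddings via matrix factorization and attaches p-values to the similarities.

```python
import math
from collections import Counter
from itertools import combinations

def cooccurrence(records):
    """Count how often two EHR concepts appear in the same patient record."""
    counts = Counter()
    for concepts in records:
        for a, b in combinations(sorted(set(concepts)), 2):
            counts[(a, b)] += 1
    return counts

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```

In practice the rows of the (transformed) co-occurrence matrix would be factorized into low-dimensional concept embeddings, and `cosine` would be applied to those embeddings rather than to raw counts.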

RESULTS: ARCH produces high-quality clinical embeddings and KG for over 60,000 EHR concepts, as visualized in the R-shiny powered web API (https://celehs.hms.harvard.edu/ARCH/). The ARCH embeddings attained an average area under the ROC curve (AUC) of 0.926 and 0.861 for detecting pairs of similar EHR concepts when the concepts are mapped to codified data and to NLP data; and 0.810 (codified) and 0.843 (NLP) for detecting related pairs. Based on the p-values computed by ARCH, the sensitivities of detecting similar and related entity pairs are 0.906 and 0.888 under false discovery rate (FDR) control of 5%. For detecting drug side effects, the cosine similarity based on the ARCH semantic representations achieved an AUC of 0.723, while the AUC improved to 0.826 after few-shot training via minimizing the loss function on the training data set. Incorporating NLP data substantially improved the ability to detect side effects in the EHR. For example, based on unsupervised ARCH embeddings, the power of detecting drug-side effect pairs when using codified data only was 0.15, much lower than the power of 0.51 when using both codified and NLP concepts. Compared to existing large-scale representation learning methods including PubMedBERT, BioBERT and SapBERT, ARCH attains the most robust performance and substantially higher accuracy in detecting these relationships. Incorporating ARCH selected features in weakly supervised phenotyping algorithms can improve the robustness of algorithm performance, especially for diseases that benefit from NLP features as supporting evidence. For example, the phenotyping algorithm for depression attained an AUC of 0.927 when using ARCH selected features but only 0.857 when using codified features selected via the KESER network[1]. In addition, embeddings and knowledge graphs generated from the ARCH network were able to cluster AD patients into two subgroups, where the fast progression subgroup had a much higher mortality rate.

CONCLUSIONS: The proposed ARCH algorithm generates large-scale high-quality semantic representations and knowledge graph for both codified and NLP EHR features, useful for a wide range of predictive modeling tasks.

PMID:37293026 | PMC:PMC10246054 | DOI:10.1101/2023.05.14.23289955

Categories: Literature Watch

HighAltitudeOmicsDB, an integrated resource for high-altitude associated genes and proteins, networks and semantic-similarities

Thu, 2023-06-08 06:00

Sci Rep. 2023 Jun 8;13(1):9307. doi: 10.1038/s41598-023-35792-3.

ABSTRACT

Millions of people worldwide visit, live, or work in the hypoxic environment encountered at high altitudes, and it is important to understand the biomolecular responses to this stress. This would help design mitigation strategies for high altitude illnesses. Despite a number of studies spanning over 100 years, the complex mechanisms controlling acclimatization to hypoxia remain largely unknown. To identify potential diagnostic, therapeutic and predictive markers for HA stress, it is important to comprehensively compare and analyse these studies. Towards this goal, HighAltitudeOmicsDB is a unique resource that provides a comprehensive, curated, user-friendly and detailed compilation of various genes/proteins which have been experimentally validated to be associated with various HA conditions, their protein-protein interactions (PPIs) and gene ontology (GO) semantic similarities. For each database entry, HighAltitudeOmicsDB additionally stores the level of regulation (up/down-regulation), fold change, study control group, duration and altitude of exposure, tissue of expression, source organism, level of hypoxia, method of experimental validation, place/country of study, ethnicity, geographical location etc. The database also collates information on disease and drug association, tissue-specific expression level, GO and KEGG pathway associations. The web resource is a unique server platform that offers interactive PPI networks and GO semantic similarity matrices among the interactors. These unique features help to offer mechanistic insights into the disease pathology. Hence, HighAltitudeOmicsDB is a unique platform for researchers working in this area to explore, fetch, compare and analyse HA-associated genes/proteins, their PPI networks, and GO semantic similarities. The database is available at http://www.altitudeomicsdb.in.

PMID:37291174 | DOI:10.1038/s41598-023-35792-3

Categories: Literature Watch

A web framework for information aggregation and management of multilingual hate speech

Mon, 2023-05-22 06:00

Heliyon. 2023 May 9;9(5):e16084. doi: 10.1016/j.heliyon.2023.e16084. eCollection 2023 May.

ABSTRACT

Social media platforms have led to the creation of a vast amount of information produced by users and published publicly, facilitating participation in the public sphere, but also giving the opportunity for certain users to publish hateful content. This content mainly involves offensive/discriminative speech towards social groups or individuals (based on racial, religious, gender or other characteristics) and could possibly lead to subsequent hate actions/crimes due to persistent escalation. Content management and moderation in big data volumes can no longer be supported manually. In the current research, a web framework is presented and evaluated for the collection, analysis, and aggregation of multilingual textual content from various online sources. The framework is designed to address the needs of human users, journalists, academics, and the public to collect and analyze content from social media and the web in Spanish, Italian, Greek, and English, without prior training or a background in Computer Science. The backend functionality provides content collection and monitoring; semantic analysis, including hate speech detection and sentiment analysis using machine learning models and rule-based algorithms; and storing, querying, and retrieving such content along with the relevant metadata in a database. This functionality is accessed through a graphic user interface in a web browser. An evaluation procedure was held through online questionnaires, including journalists and students, proving the feasibility of the use of the proposed framework by non-experts for the defined use-case scenarios.
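The rule-based side of such a semantic analysis pipeline can be as simple as lexicon scoring. The sketch below is purely illustrative and is not the paper's method: the lexicon and the `sentiment_score` helper are invented, and the real framework combines machine learning models with rule-based algorithms.

```python
# Hypothetical weighted lexicon; a production system would use a curated,
# per-language lexicon and trained classifiers alongside it.
LEXICON = {"good": 1, "great": 1, "bad": -1, "hateful": -2}

def sentiment_score(text):
    """Sum lexicon weights over the whitespace tokens of a lowercased text."""
    return sum(LEXICON.get(word, 0) for word in text.lower().split())
```

A negative score could flag content for human review; the framework's actual thresholds and models are not specified in the abstract.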

PMID:37215824 | PMC:PMC10196859 | DOI:10.1016/j.heliyon.2023.e16084

Categories: Literature Watch

Slowdowns in scalar implicature processing: Isolating the intention-reading costs in the Bott & Noveck task

Sun, 2023-05-21 06:00

Cognition. 2023 May 19;238:105480. doi: 10.1016/j.cognition.2023.105480. Online ahead of print.

ABSTRACT

An underinformative sentence, such as Some cats are mammals, is trivially true with a semantic (some and perhaps all) reading of the quantifier and false with a pragmatic (some but not all) one, with the latter reliably resulting in longer response times than the former in a truth evaluation task (Bott & Noveck, 2004). Most analyses attribute these prolonged reaction times, or costs, to the steps associated with the derivation of the scalar implicature. In the present work we investigate, across three experiments, whether such slowdowns can be attributed (at least partly) to the participant's need to adjust to the speaker's informative intention. In Experiment 1, we designed a web-based version of Bott & Noveck's (2004) laboratory task that would most reliably provide its classic results. In Experiment 2, we found that over the course of an experimental session, participants' pragmatic responses to underinformative sentences are initially reliably long and ultimately comparable to response times of logical interpretations to the same sentences. Such results cannot readily be explained by assuming that implicature derivation is a consistent source of processing effort. In Experiment 3, we further tested our account by examining how response times change as a function of the number of people said to produce the critical utterances. When participants are introduced (via a photo and description) to a single 'speaker', the results are similar to those found in Experiment 2. However, when they are introduced to two 'speakers', with the second 'speaker' appearing midway (after five encounters with underinformative items), we found a significant uptick in pragmatic response latencies to the underinformative item right after participants meet their second speaker (i.e. at their sixth encounter with an underinformative item). Overall, we interpret these results as suggesting that at least part of the cost typically attributed to the derivation of a scalar implicature is actually a consequence of how participants think about the informative intentions of the person producing the underinformative sentences.

PMID:37210877 | DOI:10.1016/j.cognition.2023.105480

Categories: Literature Watch

Disaster management ontology- an ontological approach to disaster management automation

Fri, 2023-05-19 06:00

Sci Rep. 2023 May 19;13(1):8091. doi: 10.1038/s41598-023-34874-6.

ABSTRACT

The geographical location of any region, as well as large-scale environmental changes caused by a variety of factors, can invite a wide range of disasters. Floods, droughts, earthquakes, cyclones, landslides, tornadoes, and cloudbursts are all common natural disasters that destroy property and kill people. On average, 0.1% of the total deaths globally in the past decade have been due to natural disasters. The National Disaster Management Authority (NDMA), a branch of the Ministry of Home Affairs, plays an important role in disaster management in India by taking responsibility for risk mitigation, response, and recovery from all natural and man-made disasters. This article presents an ontology-based disaster management framework based on the NDMA's responsibility matrix, named the Disaster Management Ontology (DMO). It aids in task distribution among necessary authorities at various stages of a disaster, as well as in a knowledge-driven decision support system for financial assistance to victims. In the proposed DMO, the ontology is used both to integrate knowledge and as a working platform for reasoners, and the Decision Support System (DSS) ruleset is written in Semantic Web Rule Language (SWRL), which is based on the First Order Logic (FOL) concept. In addition, OntoGraph, a class view of taxonomy, is used to make the taxonomy more interactive for users.
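The shape of such a SWRL/FOL ruleset (if all conditions hold, recommend an action) can be mimicked with a tiny forward-matching rule evaluator. This is a hedged sketch only: the rules, facts, and the `infer` helper below are invented for illustration and do not reproduce the DMO's actual responsibility matrix.

```python
# Hypothetical rules in the spirit of "condition conjunction -> recommended action".
RULES = [
    ({"disaster": "flood", "phase": "response"}, "deploy rescue teams"),
    ({"disaster": "flood", "phase": "recovery"}, "assess financial assistance to victims"),
]

def infer(facts):
    """Return every action whose conditions are all satisfied by the facts."""
    return [action for conditions, action in RULES
            if all(facts.get(key) == value for key, value in conditions.items())]
```

A real implementation would express these rules in SWRL over OWL classes and let an ontology reasoner perform the inference.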

PMID:37208434 | DOI:10.1038/s41598-023-34874-6

Categories: Literature Watch

The PrescIT Knowledge Graph: Supporting ePrescription to Prevent Adverse Drug Reactions

Fri, 2023-05-19 06:00

Stud Health Technol Inform. 2023 May 18;302:551-555. doi: 10.3233/SHTI230203.

ABSTRACT

Adverse Drug Reactions (ADRs) are an important public health issue as they can impose significant health and monetary burdens. This paper presents the engineering and use case of a Knowledge Graph, supporting the prevention of ADRs as part of a Clinical Decision Support System (CDSS) developed in the context of the PrescIT project. The presented PrescIT Knowledge Graph is built upon Semantic Web technologies namely the Resource Description Framework (RDF), and integrates widely relevant data sources and ontologies, i.e., DrugBank, SemMedDB, OpenPVSignal Knowledge Graph and DINTO, resulting in a lightweight and self-contained data source for evidence-based ADRs identification.
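At its core, an RDF knowledge graph like the one described is a set of subject-predicate-object triples queried by pattern matching. The miniature in-memory store below only illustrates that idea; the triples and predicate names are invented, and the actual PrescIT graph is RDF managed with Semantic Web tooling and SPARQL.

```python
# Invented example triples; "ex:" marks hypothetical identifiers.
TRIPLES = {
    ("ex:warfarin", "ex:interactsWith", "ex:aspirin"),
    ("ex:warfarin", "ex:mayCause", "ex:bleeding"),
    ("ex:aspirin", "ex:mayCause", "ex:bleeding"),
}

def match(s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return sorted(t for t in TRIPLES
                  if s in (None, t[0]) and p in (None, t[1]) and o in (None, t[2]))
```

A CDSS check for a prescription could then ask, for example, which adverse effects are linked to a drug via the `ex:mayCause` pattern.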

PMID:37203746 | DOI:10.3233/SHTI230203

Categories: Literature Watch

An Annotation Workbench for Semantic Annotation of Data Collection Instruments

Fri, 2023-05-19 06:00

Stud Health Technol Inform. 2023 May 18;302:108-112. doi: 10.3233/SHTI230074.

ABSTRACT

Semantic interoperability, i.e., the ability to automatically interpret the shared information in a meaningful way, is one of the most important requirements for data analysis of different sources. In the area of clinical and epidemiological studies, the target of the National Research Data Infrastructure for Personal Health Data (NFDI4Health), interoperability of data collection instruments such as case report forms (CRFs), data dictionaries and questionnaires is critical. Retrospective integration of semantic codes into study metadata at item-level is important, as ongoing or completed studies contain valuable information, which should be preserved. We present a first version of a Metadata Annotation Workbench to support annotators in dealing with a variety of complex terminologies and ontologies. User-driven development with users from the fields of nutritional epidemiology and chronic diseases ensured that the service fulfills the basic requirements for a semantic metadata annotation software for these NFDI4Health use cases. The web application can be accessed using a web browser and the source code of the software is available with an open-source MIT license.

PMID:37203619 | DOI:10.3233/SHTI230074

Categories: Literature Watch

A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond

Thu, 2023-05-18 06:00

IEEE Trans Pattern Anal Mach Intell. 2023 May 18;PP. doi: 10.1109/TPAMI.2023.3277122. Online ahead of print.

ABSTRACT

Non-autoregressive (NAR) generation, first proposed in neural machine translation (NMT) to speed up inference, has attracted much attention in both the machine learning and natural language processing communities. While NAR generation can significantly accelerate inference speed for machine translation, the speedup comes at the cost of sacrificed translation accuracy compared to its counterpart, autoregressive (AR) generation. In recent years, many new models and algorithms have been designed and proposed to bridge the accuracy gap between NAR generation and AR generation. In this paper, we conduct a systematic survey with comparisons and discussions of various non-autoregressive translation (NAT) models from different aspects. Specifically, we categorize the efforts of NAT into several groups, including data manipulation, modeling methods, training criterion, decoding algorithms, and the benefit from pre-trained models. Furthermore, we briefly review other applications of NAR models beyond machine translation, such as grammatical error correction, text summarization, text style transfer, dialogue, semantic parsing, automatic speech recognition, and so on. In addition, we also discuss potential directions for future exploration, including releasing the dependency on knowledge distillation (KD), reasonable training objectives, pre-training for NAR, and wider applications. We hope this survey can help researchers capture the latest progress in NAR generation, inspire the design of advanced NAR models and algorithms, and enable industry practitioners to choose appropriate solutions for their applications. The web page of this survey is at https://github.com/LitterBrother-Xiao/Overview-of-Non-autoregressive-Applications.

PMID:37200120 | DOI:10.1109/TPAMI.2023.3277122

Categories: Literature Watch

Integrating collective know-how for multicriteria decision support in agrifood chains-application to cheesemaking

Mon, 2023-05-15 06:00

Front Artif Intell. 2023 Apr 28;6:1145007. doi: 10.3389/frai.2023.1145007. eCollection 2023.

ABSTRACT

Agrifood chain processes are based on a multitude of knowledge, know-how and experiences forged over time. This collective expertise must be shared to improve food quality. Here we test the hypothesis that it is possible to design and implement a comprehensive methodology to create a knowledge base integrating collective expertise, while also using it to recommend technical actions required to improve food quality. The method used to test this hypothesis consists, first, of listing the functional specifications that were defined in collaboration with several partners (technical centers, vocational training schools, producers) over the course of several projects carried out in recent years. Secondly, we propose an innovative core ontology that utilizes the international languages of the Semantic Web to effectively represent knowledge in the form of decision trees. These decision trees depict potential causal relationships between situations of interest and provide recommendations for managing them through technological actions, as well as a collective assessment of the efficiency of those actions. We show how mind map files created using mind-mapping tools are automatically translated into an RDF knowledge base using the core ontological model. Thirdly, a model to aggregate individual assessments provided by technicians and associated with technical action recommendations is proposed and evaluated. Finally, a multicriteria decision-support system (MCDSS) using the knowledge base is presented. It consists of an explanatory view allowing navigation in a decision tree and an action view for multicriteria filtering and possible side effect identification. The different types of MCDSS-delivered answers to a query expressed in the action view are explained. The MCDSS graphical user interface is presented through a real-use case. Experimental assessments have been performed and confirm that the tested hypothesis is relevant.

PMID:37187891 | PMC:PMC10175634 | DOI:10.3389/frai.2023.1145007

Categories: Literature Watch

Usability Testing of a Multi-Level Modeling Framework for Just-in-Time Adaptive Interventions (JITAIs) in Mobile Health

Fri, 2023-05-12 06:00

Stud Health Technol Inform. 2023 May 2;301:121-122. doi: 10.3233/SHTI230023.

ABSTRACT

The just-in-time adaptive intervention (JITAI) is an intervention design to support health behavior change. We designed a multi-level modeling framework for JITAIs and developed a proof-of-concept prototype (POC). This study aimed at investigating the usability of the POC by conducting two usability tests with students. We assessed the usability, the students' workload, and their success in completing tasks. In the second usability test, however, they faced difficulties in completing the tasks. We will work on hiding the complexity of the framework as well as improving the frontend and the instructions.

PMID:37172164 | DOI:10.3233/SHTI230023

Categories: Literature Watch

REDbox: a comprehensive semantic framework for data collection and management in tuberculosis research

Thu, 2023-05-11 06:00

Sci Rep. 2023 May 11;13(1):7686. doi: 10.1038/s41598-023-33492-6.

ABSTRACT

Clinical research outcomes depend on the correct definition of the research protocol, the data collection strategy, and the data management plan. Furthermore, researchers often need to work within challenging contexts, as is the case in tuberculosis services, where human and technological resources for research may be scarce. Electronic Data Capture Systems mitigate such risks and enable a reliable environment to conduct health research and promote result dissemination and data reusability. The proposed solution is based on needs pinpointed by researchers, considering the need for an accommodating solution to conduct research in low-resource environments. The REDbox framework was developed to facilitate data collection, management, sharing, and availability in tuberculosis research and improve the user experience through user-friendly, web-based tools. REDbox combines elements of the REDCap and KoBoToolbox electronic data capture systems and semantics to deliver new valuable tools that meet the needs of tuberculosis researchers in Brazil. The framework was implemented in five cross-institutional, nationwide projects to evaluate the users' perceptions of the system's usefulness and the information and user experience. Seventeen responses (representing 40% of active users) to an anonymous survey distributed to active users indicated that REDbox was perceived to be helpful for the particular audience of researchers and health professionals. The relevance of this article lies in the innovative approach to supporting tuberculosis research by combining existing technologies and tailoring supporting features.

PMID:37169802 | DOI:10.1038/s41598-023-33492-6

Categories: Literature Watch

Automated approach for quality assessment of RDF resources

Wed, 2023-05-10 06:00

BMC Med Inform Decis Mak. 2023 May 10;23(Suppl 1):90. doi: 10.1186/s12911-023-02182-8.

ABSTRACT

INTRODUCTION: The Semantic Web community provides a common Resource Description Framework (RDF) that allows representation of resources such that they can be linked. To maximize the potential of linked data - machine-actionable interlinked resources on the Web - a certain level of quality of RDF resources should be established, particularly in the biomedical domain in which concepts are complex and high-quality biomedical ontologies are in high demand. However, it is unclear which quality metrics for RDF resources exist that can be automated, which is required given the multitude of RDF resources. Therefore, we aim to determine these metrics and demonstrate an automated approach to assess such metrics of RDF resources.

METHODS: An initial set of metrics is identified through literature, standards, and existing tooling. Of these, metrics are selected that fulfil three criteria: (1) objective; (2) automatable; and (3) foundational. Selected metrics are represented in RDF and semantically aligned to existing standards. These metrics are then implemented in an open-source tool. To demonstrate the tool, eight commonly used RDF resources were assessed, including data models in the healthcare domain (HL7 RIM, HL7 FHIR, CDISC CDASH), ontologies (DCT, SIO, FOAF, ORDO), and a metadata profile (GRDDL).
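One of the metric categories, parsability, amounts to checking whether a resource's serialization can be read at all. The sketch below validates only a tiny, hypothetical subset of the N-Triples syntax (IRIs and plain string literals); the paper's tool would rely on full RDF parsers, and a resolvability check would additionally attempt an HTTP request per URI (not shown).

```python
import re

# Toy grammar: "<iri> <iri> (<iri> | "literal") ." on each non-blank line.
NT_LINE = re.compile(r'^<[^<>\s]+> <[^<>\s]+> (<[^<>\s]+>|"[^"]*") \.$')

def parsable(document):
    """True if every non-blank line looks like a valid simple N-Triples statement."""
    lines = [line.strip() for line in document.splitlines() if line.strip()]
    return all(NT_LINE.match(line) for line in lines)
```

This kind of cheap syntactic gate is what makes the metric objective and automatable: it either parses or it does not, with no human judgment involved.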

RESULTS: Six objective metrics are identified in three categories: Resolvability (1), Parsability (1), and Consistency (4), and represented in RDF. The tool demonstrates that these metrics can be automated, and application in the healthcare domain shows non-resolvable URIs (ranging from 0.3% to 97%) among all eight resources and undefined URIs in HL7 RIM and FHIR. In the tested resources, no errors were found for parsability or for the other three consistency metrics for correct usage of classes and properties.

CONCLUSION: We extracted six objective and automatable metrics from the literature as the foundational quality requirements of RDF resources to maximize the potential of linked data. Automated tooling to assess resources has been shown to be effective in identifying quality issues that must be avoided. This approach can be expanded by implementing more automatable metrics in the assessment tool, reflecting additional quality dimensions.

PMID:37165363 | DOI:10.1186/s12911-023-02182-8

Categories: Literature Watch

Artificially-generated consolidations and balanced augmentation increase performance of U-net for lung parenchyma segmentation on MR images

Tue, 2023-05-09 06:00

PLoS One. 2023 May 9;18(5):e0285378. doi: 10.1371/journal.pone.0285378. eCollection 2023.

ABSTRACT

PURPOSE: To improve automated lung segmentation on 2D lung MR images using balanced augmentation and artificially-generated consolidations for training of a convolutional neural network (CNN).

MATERIALS AND METHODS: From 233 healthy volunteers and 100 patients, 1891 coronal MR images were acquired. Of these, 1666 images without consolidations were used to build a binary semantic CNN for lung segmentation and 225 images (187 without consolidations, 38 with consolidations) were used for testing. To increase CNN performance of segmenting lung parenchyma with consolidations, balanced augmentation was performed and artificially-generated consolidations were added to all training images. The proposed CNN (CNNBal/Cons) was compared to two other CNNs: CNNUnbal/NoCons (without balanced augmentation and artificially-generated consolidations) and CNNBal/NoCons (with balanced augmentation but without artificially-generated consolidations). Segmentation results were assessed using the Sørensen-Dice coefficient (SDC) and the Hausdorff distance coefficient.
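The Sørensen-Dice coefficient used for assessment is twice the overlap of predicted and ground-truth masks divided by their total size. A minimal sketch on flat binary masks (the `dice` helper is illustrative, not the study's evaluation code, which operates on 2D images):

```python
def dice(pred, truth):
    """Sørensen-Dice coefficient for flat binary masks (lists of 0/1)."""
    intersection = sum(p & t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Convention: two empty masks count as a perfect match.
    return 2.0 * intersection / total if total else 1.0
```

The reported SDC percentages (e.g. 94.3%) correspond to this value scaled by 100 and averaged over test images.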

RESULTS: Regarding the 187 MR test images without consolidations, the mean SDC of CNNUnbal/NoCons (92.1 ± 6% (mean ± standard deviation)) was significantly lower compared to CNNBal/NoCons (94.0 ± 5.3%, P = 0.0013) and CNNBal/Cons (94.3 ± 4.1%, P = 0.0001). No significant difference was found between SDC of CNNBal/Cons and CNNBal/NoCons (P = 0.54). For the 38 MR test images with consolidations, SDC of CNNUnbal/NoCons (89.0 ± 7.1%) was not significantly different compared to CNNBal/NoCons (90.2 ± 9.4%, P = 0.53). SDC of CNNBal/Cons (94.3 ± 3.7%) was significantly higher compared to CNNBal/NoCons (P = 0.0146) and CNNUnbal/NoCons (P = 0.001).

CONCLUSIONS: Expanding training datasets via balanced augmentation and artificially-generated consolidations improved the accuracy of CNNBal/Cons, especially in datasets with parenchymal consolidations. This is an important step towards a robust automated postprocessing of lung MRI datasets in clinical routine.

PMID:37159468 | PMC:PMC10168553 | DOI:10.1371/journal.pone.0285378

Categories: Literature Watch

Hybrid Contextual Semantic Network for Accurate Segmentation and Detection of Small-Size Stroke Lesions From MRI

Mon, 2023-05-08 06:00

IEEE J Biomed Health Inform. 2023 Aug;27(8):4062-4073. doi: 10.1109/JBHI.2023.3273771. Epub 2023 Aug 7.

ABSTRACT

Stroke is a cerebrovascular disease with high mortality and disability rates. The occurrence of a stroke typically produces lesions of different sizes, with the accurate segmentation and detection of small-size stroke lesions being closely related to the prognosis of patients. However, while large lesions are usually correctly identified, small-size lesions are often ignored. This article provides a hybrid contextual semantic network (HCSNet) that can accurately and simultaneously segment and detect small-size stroke lesions from magnetic resonance images. HCSNet inherits the advantages of the encoder-decoder architecture and applies a novel hybrid contextual semantic module that generates high-quality contextual semantic features from the spatial and channel contextual semantic features through the skip connection layer. Moreover, a mixing-loss function is proposed to optimize HCSNet for unbalanced small-size lesions. HCSNet is trained and evaluated on 2D magnetic resonance images produced from the Anatomical Tracings of Lesions After Stroke challenge (ATLAS R2.0). Extensive experiments demonstrate that HCSNet outperforms several other state-of-the-art methods in its ability to segment and detect small-size stroke lesions. Visualization and ablation experiments reveal that the hybrid semantic module improves the segmentation and detection performance of HCSNet.

PMID:37155390 | DOI:10.1109/JBHI.2023.3273771

Categories: Literature Watch

The involvement of the semantic neural network in rule identification of mathematical processing

Sat, 2023-05-06 06:00

Cortex. 2023 Jul;164:11-20. doi: 10.1016/j.cortex.2023.03.010. Epub 2023 Apr 18.

ABSTRACT

The role of the visuospatial network in mathematical processing has been established, but the involvement of the semantic network in mathematical processing is still poorly understood. The current study utilized a number series completion paradigm with the event-related potential (ERP) technique to examine whether the semantic network supports mathematical processing and to find the corresponding spatiotemporal neural marker. In total, 32 right-handed undergraduate students were recruited and asked to complete the number series completion task as well as the arithmetical computation task, in which numbers were presented in sequence. The event-related potential and multi-voxel pattern analyses showed that the rule identification process involves more semantic processing than the arithmetical computation processes, and it elicited higher amplitudes for the late negative component (LNC) in the left frontal and temporal lobes. These results demonstrate that the semantic network supports rule identification in mathematical processing, with the LNC acting as the neural marker.

PMID:37148824 | DOI:10.1016/j.cortex.2023.03.010

Categories: Literature Watch

Changing word meanings in biomedical literature reveal pandemics and new technologies

Fri, 2023-05-05 06:00

BioData Min. 2023 May 5;16(1):16. doi: 10.1186/s13040-023-00332-2.

ABSTRACT

While we often think of words as having a fixed meaning that we use to describe a changing world, words are also dynamic and changing. Scientific research can also be remarkably fast-moving, with new concepts or approaches rapidly gaining mind share. We examined scientific writing, both preprint and pre-publication peer-reviewed text, to identify terms that have changed and examine their use. One particular challenge that we faced was that the shift from closed to open access publishing meant that the size of available corpora changed by over an order of magnitude in the last two decades. We developed an approach to evaluate semantic shift by accounting for both intra- and inter-year variability using multiple integrated models. This analysis revealed thousands of change points in both corpora, including for terms such as 'cas9', 'pandemic', and 'sars'. We found that the consistent change-points between pre-publication peer-reviewed and preprinted text are largely related to the COVID-19 pandemic. We also created a web app for exploration that allows users to investigate individual terms (https://greenelab.github.io/word-lapse/). To our knowledge, our research is the first to examine semantic shift in biomedical preprints and pre-publication peer-reviewed text, and provides a foundation for future work to understand how terms acquire new meanings and how peer review affects this process.
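The underlying idea of a semantic change point (a term's embedding in one year diverging sharply from the previous year) can be sketched with a simple cosine-drift threshold. This is a toy under stated assumptions: the `change_points` helper and the threshold are invented, and the study's actual method integrates multiple models and accounts for intra- and inter-year variability.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def change_points(yearly_vectors, threshold=0.5):
    """Flag years whose embedding diverges sharply from the previous year's."""
    years = sorted(yearly_vectors)
    return [cur for prev, cur in zip(years, years[1:])
            if cosine(yearly_vectors[prev], yearly_vectors[cur]) < threshold]
```

For a term like 'pandemic', the per-year embeddings before and after 2020 would be expected to show exactly this kind of drop in similarity.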

PMID:37147665 | DOI:10.1186/s13040-023-00332-2

Categories: Literature Watch

The SciQA Scientific Question Answering Benchmark for Scholarly Knowledge

Thu, 2023-05-04 06:00

Sci Rep. 2023 May 4;13(1):7240. doi: 10.1038/s41598-023-33607-z.

ABSTRACT

Knowledge graphs have gained increasing popularity in the last decade in science and technology. However, knowledge graphs are currently relatively simple-to-moderate semantic structures that are mainly collections of factual statements. Question answering (QA) benchmarks and systems were so far mainly geared towards encyclopedic knowledge graphs such as DBpedia and Wikidata. We present SciQA, a scientific QA benchmark for scholarly knowledge. The benchmark leverages the Open Research Knowledge Graph (ORKG), which includes almost 170,000 resources describing research contributions of almost 15,000 scholarly articles from 709 research fields. Following a bottom-up methodology, we first manually developed a set of 100 complex questions that can be answered using this knowledge graph. Furthermore, we devised eight question templates with which we automatically generated a further 2,465 questions that can also be answered with the ORKG. The questions cover a range of research fields and question types and are translated into corresponding SPARQL queries over the ORKG. Based on two preliminary evaluations, we show that the resulting SciQA benchmark represents a challenging task for next-generation QA systems. This task is part of the open competitions at the 22nd International Semantic Web Conference 2023 as the Scholarly Question Answering over Linked Data (QALD) Challenge.

PMID:37142627 | DOI:10.1038/s41598-023-33607-z

Categories: Literature Watch

Electrochemical and Photoelectrochemical Immunosensors for the Detection of Ovarian Cancer Biomarkers

Fri, 2023-04-28 06:00

Sensors (Basel). 2023 Apr 19;23(8):4106. doi: 10.3390/s23084106.

ABSTRACT

Photoelectrochemical (PEC) sensing is an emerging technological innovation for monitoring small substances/molecules in biological or non-biological systems. In particular, there has been a surge of interest in developing PEC devices for determining molecules of clinical significance. This is especially the case for molecules that are markers for serious and deadly medical conditions. The increased interest in PEC sensors to monitor such biomarkers can be attributed to the many apparent advantages of the PEC system, including an enhanced measurable signal, high potential for miniaturization, rapid testing, and low cost, amongst others. The growing number of published research reports on the subject calls for a comprehensive review of the various findings. This article is a review of studies on electrochemical (EC) and PEC sensors for ovarian cancer biomarkers in the last seven years (2016-2022). EC sensors were included because PEC is an improvement on EC, and, as expected, a comparison of both systems has been carried out in many studies. Specific attention was given to the different markers of ovarian cancer and the EC/PEC sensing platforms developed for their detection/quantification. Relevant articles were sourced from the following databases: Scopus, PubMed Central, Web of Science, Science Direct, Academic Search Complete, EBSCO, CORE, Directory of Open Access Journals (DOAJ), Public Library of Science (PLOS), BioMed Central (BMC), Semantic Scholar, Research Gate, SciELO, Wiley Online Library, Elsevier and SpringerLink.

PMID:37112447 | DOI:10.3390/s23084106

Categories: Literature Watch

Episodic and Semantic Autobiographical Memory in Mild Cognitive Impairment (MCI): A Systematic Review

Fri, 2023-04-28 06:00

J Clin Med. 2023 Apr 13;12(8):2856. doi: 10.3390/jcm12082856.

ABSTRACT

INTRODUCTION: Mild cognitive impairment (MCI) is a syndrome defined as a decline in cognitive performance greater than expected for an individual's age and education level, without notably interfering with daily life activities. Many studies have focused on the memory domain in the analysis of MCI and of more severe cases of dementia. One specific memory system is autobiographical memory (AM), whose impairment has been studied extensively in Alzheimer's disease; however, whether AM is impaired in milder forms of decline, such as MCI, remains controversial.

OBJECTIVE: The main aim of this systematic review is to analyze the functioning of autobiographical memory in patients with MCI, considering both the semantic and the episodic components.

MATERIALS: The review process was conducted according to the PRISMA statement. The search was conducted until 20 February 2023 in the following bibliographical databases: PubMed, Web of Science, Scopus, and PsycInfo, and twenty-one articles were included.

RESULTS: The results highlight mixed findings concerning the semantic component of AM, since only seven studies found worse semantic AM performance in patients with MCI than in the healthy control (HC) group. The evidence for impaired episodic AM in individuals with MCI is more consistent than that concerning semantic AM.

CONCLUSIONS: Starting from the evidence of this systematic review, further studies should detect and investigate the cognitive and emotional mechanisms that undermine AM performance, allowing the development of specific interventions targeting these mechanisms.

PMID:37109193 | DOI:10.3390/jcm12082856

Categories: Literature Watch

Changes in general practice use and costs with COVID-19 and telehealth initiatives: analysis of Australian whole-population linked data

Thu, 2023-04-27 06:00

Br J Gen Pract. 2023 Apr 27;73(730):e364-e373. doi: 10.3399/BJGP.2022.0351. Print 2023 May.

ABSTRACT

BACKGROUND: In response to the COVID-19 pandemic, general practice in Australia underwent a rapid transition, including the roll-out of population-wide telehealth, with uncertain impacts on GP use and costs.

AIM: To describe how use and costs of GP services changed in 2020 - following the COVID-19 pandemic and introduction of telehealth - compared with 2019, and how this varied across population subgroups.

DESIGN AND SETTING: Linked-data analysis of whole-population data for Australia.

METHOD: Multi-Agency Data Integration Project data for ∼19 million individuals from the 2016 census were linked to Medicare data for 2019-2020. Regression models were used to compare age- and sex-adjusted GP use and out-of-pocket costs over time, overall, and by sociodemographic characteristics.

RESULTS: Of the population, 85.5% visited a GP in Q2-Q4 2020, compared with 89.5% in the same period of 2019. The mean number of face-to-face GP services per quarter declined, while telehealth services increased; overall use of GP services in Q4 2020 was similar to, or higher than, that of Q4 2019 for most groups. The proportion of total GP services by telehealth stabilised at 23.5% in Q4 2020. However, individuals aged 3-14 years, ≥70 years, and those with limited English proficiency used fewer GP services in 2020 compared with 2019, with a lower proportion by telehealth, compared with the rest of the population. Mean out-of-pocket costs per service were lower across all subgroups in 2020 compared with 2019.

CONCLUSION: The introduction of widespread telehealth maintained the use of GP services during the COVID-19 pandemic and minimised out-of-pocket costs, but not for all population subgroups.

PMID:37105730 | PMC:PMC9975989 | DOI:10.3399/BJGP.2022.0351

Categories: Literature Watch
