Semantic Web
Pre-treatment graph measures of a functional semantic network are associated with naming therapy outcomes in chronic aphasia.
Brain Lang. 2020 08;207:104809
Authors: Johnson JP, Meier EL, Pan Y, Kiran S
Abstract
Naming treatment outcomes in post-stroke aphasia are variable and the factors underlying this variability are incompletely understood. In this study, 26 patients with chronic aphasia completed a semantic judgment fMRI task before receiving up to 12 weeks of naming treatment. Global (i.e., network-wide) and local (i.e., regional) graph theoretic measures of pre-treatment functional connectivity were analyzed to identify differences between patients who responded most and least favorably to treatment (i.e., responders and nonresponders) and determine if network measures predicted naming improvements. Responders had higher levels of global integration (i.e., average network strength and global efficiency) than nonresponders, and these measures predicted treatment effects after controlling for lesion volume and age. Group differences in local measures were identified in several regions associated with a variety of cognitive functions. These results suggest there is a meaningful and possibly prognostically-informative relationship between patients' functional network properties and their response to naming therapy.
PMID: 32505940 [PubMed - indexed for MEDLINE]
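The two global integration measures named in the abstract above (average network strength and global efficiency) can be illustrated with a minimal sketch. The abstract does not say how the connectivity matrices were thresholded or weighted, so this version makes two assumptions: strength is computed on a symmetric weight matrix, and efficiency is computed on a binarized, undirected graph. The function names are illustrative and not taken from the study.

```python
from collections import deque


def average_strength(weights):
    """Mean node strength; a node's strength is the sum of its edge weights."""
    n = len(weights)
    return sum(sum(row) for row in weights) / n


def global_efficiency(adjacency):
    """Mean inverse shortest-path length over all ordered node pairs.

    Assumes a binary, undirected adjacency matrix. Unreachable pairs
    contribute 0, which is the usual convention for this measure.
    """
    n = len(adjacency)
    if n < 2:
        return 0.0
    total = 0.0
    for source in range(n):
        # Breadth-first search gives hop distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if adjacency[u][v] and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for node, d in dist.items() if node != source)
    return total / (n * (n - 1))
```

Higher values of either measure indicate a more tightly integrated network, which is the direction the abstract reports for treatment responders.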
Why can deep convolutional neural networks improve protein fold recognition? A visual explanation by interpretation.
Brief Bioinform. 2021 Feb 04;:
Authors: Liu Y, Zhu YH, Song X, Song J, Yu DJ
Abstract
As an essential task in protein structure and function prediction, protein fold recognition has attracted increasing attention. The majority of existing machine learning-based protein fold recognition approaches rely strongly on handcrafted features, which depict the characteristics of different protein folds; however, effective feature extraction remains the bottleneck for further performance improvement. As a powerful feature extractor, a deep convolutional neural network (DCNN) can automatically extract discriminative features for fold recognition without human intervention and has demonstrated impressive performance on protein fold recognition. Despite the encouraging progress, a DCNN often acts as a black box, and as such, it is challenging for users to understand what really happens inside it and why it works well for protein fold recognition. In this study, we explore the intrinsic mechanism of DCNN and explain why it works for protein fold recognition using a visual explanation technique. More specifically, we first trained a VGGNet-based DCNN model, termed VGGNet-FE, which can extract fold-specific features from the predicted protein residue-residue contact map for protein fold recognition. Subsequently, based on the trained VGGNet-FE, we implemented a new contact-assisted predictor, termed VGGfold, for protein fold recognition; we then visualized what features were extracted by each of the convolutional layers in VGGNet-FE using a deconvolution technique. Furthermore, we visualized the high-level semantic information, termed fold-discriminative region, of a predicted contact map from the localization map obtained from the last convolutional layer of VGGNet-FE. It is visually confirmed that VGGNet-FE could effectively extract distinct fold-discriminative regions for different types of protein folds, thereby accounting for the improved performance of VGGfold for protein fold recognition.
In summary, this study is of great significance for both understanding the working principle of DCNNs in protein fold recognition and exploring the relationship between the predicted protein contact map and protein tertiary structure. This proposed visualization method is flexible and applicable to address other DCNN-based bioinformatics and computational biology questions. The online web server of VGGfold is freely available at http://csbio.njust.edu.cn/bioinf/vggfold/.
PMID: 33537753 [PubMed - as supplied by publisher]
Seasonality of Back Pain in Italy: An Infodemiology Study.
Int J Environ Res Public Health. 2021 Feb 01;18(3):
Authors: Ciaffi J, Meliconi R, Landini MP, Mancarella L, Brusi V, Faldini C, Ursini F
Abstract
BACKGROUND: E-health tools have been used to assess the temporal variations of different health problems. The aim of our infodemiology study was to investigate the seasonal pattern of search volumes for back pain in Italy.
METHODS: In Italian, back pain is indicated by the medical word "lombalgia". Using Google Trends, we selected the three search terms related to "lombalgia" with the highest relative search volumes (RSV), namely "mal di schiena", "dolore alla schiena" and "dolore lombare", representing the semantic preferences of users when performing web queries for back pain in Italy. Wikipedia page view statistics were used to identify the number of visits to the page "lombalgia". Strength and direction of secular trends were assessed using the Mann-Kendall test. Cosinor analysis was used to evaluate the potential seasonality of back pain-related RSV.
RESULTS: We found a significant upward secular trend from 2005 to 2020 for search terms "mal di schiena" (τ = 0.734, p < 0.0001), "dolore alla schiena" (τ = 0.713, p < 0.0001) and "dolore lombare" (τ = 0.628, p < 0.0001). Cosinor analysis on Google Trends RSV showed a significant seasonality for the terms "mal di schiena" (pcos < 0.001), "dolore alla schiena" (pcos < 0.0001), "dolore lombare" (pcos < 0.0001) and "lombalgia" (pcos = 0.017). Cosinor analysis performed on views for the page "lombalgia" in Wikipedia confirmed a significant seasonality (pcos < 0.0001). Both analyses demonstrated a peak of interest in winter months and a decrease in spring/summer.
CONCLUSIONS: Our infodemiology approach revealed significant seasonal fluctuations in search queries for back pain in Italy, with peaking volumes during the coldest months of the year.
PMID: 33535709 [PubMed - in process]
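The cosinor analysis mentioned in the abstract above fits a cosine of known period to a time series, y(t) = M + A cos(2πt/T − φ), where M is the mean level (mesor), A the amplitude and φ the phase of the peak. The sketch below is a simplified illustration, not the authors' procedure. It assumes evenly spaced monthly values covering a whole number of 12-month periods, in which case the least-squares coefficients reduce to simple sums, and it does not compute the significance value (pcos) reported in the abstract. The function name and the synthetic series are hypothetical.

```python
import math


def cosinor_fit(values, period=12):
    """Single-component cosinor fit for evenly spaced data.

    Returns (mesor, amplitude, peak_time), where peak_time is the time index
    within one period at which the fitted cosine is highest.
    Assumes len(values) is a whole number of periods, so that the cosine and
    sine regressors are orthogonal and no matrix inversion is needed.
    """
    n = len(values)
    if n == 0 or n % period != 0:
        raise ValueError("this shortcut needs a whole number of periods")
    omega = 2 * math.pi / period
    mesor = sum(values) / n
    # Least-squares coefficients of cos(omega*t) and sin(omega*t).
    beta = 2 / n * sum(y * math.cos(omega * t) for t, y in enumerate(values))
    gamma = 2 / n * sum(y * math.sin(omega * t) for t, y in enumerate(values))
    amplitude = math.hypot(beta, gamma)
    peak_time = (math.atan2(gamma, beta) / omega) % period
    return mesor, amplitude, peak_time


# Synthetic monthly series (not real search data) peaking at month index 1.
series = [50 + 10 * math.cos(2 * math.pi * (t - 1) / 12) for t in range(36)]
mesor, amplitude, peak_time = cosinor_fit(series)
```

For irregularly sampled data or incomplete periods, the same model would instead be fitted by ordinary least squares on the cosine and sine regressors.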
Applications of weighted association networks applied to compositional data in biology.
Environ Microbiol. 2020 08;22(8):3020-3038
Authors: Espinoza JL, Shah N, Singh S, Nelson KE, Dupont CL
Abstract
Next-generation sequencing technologies have generated, and continue to produce, an increasingly large corpus of biological data. The data generated are inherently compositional as they convey only relative information dependent upon the capacity of the instrument, experimental design and technical bias. There is considerable information to be gained through network analysis by studying the interactions between components within a system. Network theory methods using compositional data are powerful approaches for quantifying relationships between biological components and their relevance to phenotype, environmental conditions or other external variables. However, many of the statistical assumptions used for network analysis are not designed for compositional data and can bias downstream results. In this mini-review, we illustrate the utility of network theory in biological systems and investigate modern techniques while introducing researchers to frameworks for implementation. We overview (1) compositional data analysis, (2) data transformations and (3) network theory along with insight on a battery of network types including static-, temporal-, sample-specific- and differential-networks. The intention of this mini-review is not to provide a comprehensive overview of network methods, rather to introduce microbiology researchers to (semi)-unsupervised data-driven approaches for inferring latent structures that may give insight into biological phenomena or abstract mechanics of complex systems.
PMID: 32436334 [PubMed - indexed for MEDLINE]
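As an illustration of the kind of data transformation the abstract above refers to, the sketch below implements the centered log-ratio (CLR) transform, a common choice for compositional data. The abstract does not name specific transformations, so treating CLR as the example is an assumption; the function name and the pseudocount parameter are illustrative.

```python
import math


def clr(composition, pseudocount=0.0):
    """Centered log-ratio transform of one sample.

    Each part is log-transformed and centered on the mean log value, so the
    result depends only on ratios between parts, not on the sample total.
    """
    values = [x + pseudocount for x in composition]
    if any(v <= 0 for v in values):
        raise ValueError("CLR needs strictly positive parts; add a pseudocount")
    logs = [math.log(v) for v in values]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]
```

Because only ratios matter, a sample of raw counts and the same sample rescaled to a different sequencing depth map to the same CLR vector, which is why the transform is often applied before computing associations for a network.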
biotoolsSchema: a formalized schema for bioinformatics software description.
Gigascience. 2021 Jan 27;10(1):
Authors: Ison J, Ienasescu H, Rydza E, Chmura P, Rapacki K, Gaignard A, Schwämmle V, van Helden J, Kalaš M, Ménager H
Abstract
BACKGROUND: Life scientists routinely face massive and heterogeneous data analysis tasks and must find and access the most suitable databases or software in a jungle of web-accessible resources. The diversity of information used to describe life-scientific digital resources presents an obstacle to their utilization. Although several standardization efforts are emerging, no information schema has been sufficiently detailed to enable uniform semantic and syntactic description (and cataloguing) of bioinformatics resources.
FINDINGS: Here we describe biotoolsSchema, a formalized information model that balances the needs of conciseness for rapid adoption against the provision of rich technical information and scientific context. biotoolsSchema results from a series of community-driven workshops and is deployed in the bio.tools registry, providing the scientific community with >17,000 machine-readable and human-understandable descriptions of software and other digital life-science resources. We compare our approach to related initiatives and provide alignments to foster interoperability and reusability.
CONCLUSIONS: biotoolsSchema supports the formalized, rigorous, and consistent specification of the syntax and semantics of bioinformatics resources, and enables cataloguing efforts such as bio.tools that help scientists to find, comprehend, and compare resources. The use of biotoolsSchema in bio.tools promotes the FAIRness of research software, a key element of open and reproducible developments for data-intensive sciences.
PMID: 33506265 [PubMed - in process]
Use of the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) for Processing Free Text in Health Care: Systematic Scoping Review.
J Med Internet Res. 2021 Jan 26;23(1):e24594
Authors: Gaudet-Blavignac C, Foufi V, Bjelogrlic M, Lovis C
Abstract
BACKGROUND: Interoperability and secondary use of data is a challenge in health care. Specifically, the reuse of clinical free text remains an unresolved problem. The Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) has become the universal language of health care and presents characteristics of a natural language. Its use to represent clinical free text could constitute a solution to improve interoperability.
OBJECTIVE: Although the use of SNOMED and SNOMED CT has already been reviewed, its specific use in processing and representing unstructured data such as clinical free text has not. This review aims to better understand SNOMED CT's use for representing free text in medicine.
METHODS: A scoping review was performed on the topic by searching MEDLINE, Embase, and Web of Science for publications featuring free-text processing and SNOMED CT. A recursive reference review was conducted to broaden the scope of research. The review covered the type of processed data, the targeted language, the goal of the terminology binding, the method used and, when appropriate, the specific software used.
RESULTS: In total, 76 publications were selected for an extensive study. The language targeted by publications was 91% (n=69) English. The most frequent types of documents for which the terminology was used are complementary exam reports (n=18, 24%) and narrative notes (n=16, 21%). Mapping to SNOMED CT was the final goal of the research in 21% (n=16) of publications and a part of the final goal in 33% (n=25). The main objectives of mapping are information extraction (n=44, 39%), feature in a classification task (n=26, 23%), and data normalization (n=23, 20%). The method used was rule-based in 70% (n=53) of publications, hybrid in 11% (n=8), and machine learning in 5% (n=4). In total, 12 different software packages were used to map text to SNOMED CT concepts, the most frequent being Medtex, Mayo Clinic Vocabulary Server, and Medical Text Extraction Reasoning and Mapping System. Full terminology was used in 64% (n=49) of publications, whereas only a subset was used in 30% (n=23) of publications. Postcoordination was proposed in 17% (n=13) of publications, and only 5% (n=4) of publications specifically mentioned the use of the compositional grammar.
CONCLUSIONS: SNOMED CT has been largely used to represent free-text data, most frequently with rule-based approaches, in English. However, there is currently no easy solution for mapping free text to this terminology and performing automatic postcoordination. Most solutions conceive SNOMED CT as a simple terminology rather than as a compositional bag of ontologies. Since 2012, the number of publications on this subject per year has decreased. However, the need for formal semantic representation of free text in health care is high, and automatic encoding into a compositional ontology could be a solution.
PMID: 33496673 [PubMed - as supplied by publisher]
A Semantic-Based Approach for Managing Healthcare Big Data: A Survey.
J Healthc Eng. 2020;2020:8865808
Authors: Hammad R, Barhoush M, Abed-Alguni BH
Abstract
Healthcare information systems can reduce treatment costs, anticipate epidemic outbreaks, help avoid preventable illnesses, and improve quality of life. Recently, large volumes of heterogeneous healthcare data have been produced from sources including patients' clinical records, lab results, and wearable devices, making it hard for conventional data processing to handle and manage this amount of data. Given the challenges of managing healthcare big data, such as volume, velocity, and variety, healthcare information systems need new methods and techniques for managing and processing such data to extract useful information and knowledge. In the past few years, a large number of organizations and companies have shown enthusiasm for using semantic web technologies with healthcare big data to convert data into knowledge and intelligence. In this paper, we review the state of the art on the semantic web for the healthcare industry. Based on our literature review, we discuss how different techniques, standards, and points of view created by the semantic web community can help address the challenges related to healthcare big data.
PMID: 33489061 [PubMed - in process]
Small Semantic Networks in Individuals with Autism Spectrum Disorder Without Intellectual Impairment: A Verbal Fluency Approach.
J Autism Dev Disord. 2020 Nov;50(11):3967-3987
Authors: Ehlen F, Roepke S, Klostermann F, Baskow I, Geise P, Belica C, Tiedt HO, Behnia B
Abstract
Individuals with Autism Spectrum Disorder (ASD) experience a variety of symptoms sometimes including atypicalities in language use. The study explored differences in semantic network organisation of adults with ASD without intellectual impairment. We assessed clusters and switches in verbal fluency tasks ('animals', 'human feature', 'verbs', 'r-words') via curve fitting in combination with corpus-driven analysis of semantic relatedness and evaluated socio-emotional and motor action related content. Compared to participants without ASD (n = 39), participants with ASD (n = 32) tended to produce smaller clusters, longer switches, and fewer words in semantic conditions (no p values survived Bonferroni-correction), whereas relatedness and content were similar. In ASD, semantic networks underlying cluster formation appeared comparably small without affecting strength of associations or content.
PMID: 32198662 [PubMed - indexed for MEDLINE]
An empirical meta-analysis of the life sciences linked open data on the web.
Sci Data. 2021 Jan 21;8(1):24
Authors: Kamdar MR, Musen MA
Abstract
While the biomedical community has published several "open data" sources in the last decade, most researchers still endure severe logistical and technical challenges to discover, query, and integrate heterogeneous data and knowledge from multiple sources. To tackle these challenges, the community has experimented with Semantic Web and linked data technologies to create the Life Sciences Linked Open Data (LSLOD) cloud. In this paper, we extract schemas from more than 80 biomedical linked open data sources into an LSLOD schema graph and conduct an empirical meta-analysis to evaluate the extent of semantic heterogeneity across the LSLOD cloud. We observe that several LSLOD sources exist as stand-alone data sources that are not inter-linked with other sources, use unpublished schemas with minimal reuse or mappings, and have elements that are not useful for data integration from a biomedical perspective. We envision that the LSLOD schema graph and the findings from this research will aid researchers who wish to query and integrate data and knowledge from multiple biomedical sources simultaneously on the Web.
PMID: 33479214 [PubMed - in process]
Ethnomedicinal uses, phytochemistry, and biological activity of plants of the genus Gynura.
J Ethnopharmacol. 2021 Jan 16;:113834
Authors: Bari MS, Khandokar L, Haque E, Romano B, Capasso R, Seidel V, Haque MA, Rashid MA
Abstract
ETHNOPHARMACOLOGICAL RELEVANCE: The genus Gynura (Compositae) includes around 46 species and is native to the tropical regions of Southeast Asia, Africa and Australia. Many species within this genus are used in ethnomedicine to treat various disorders including skin diseases, injuries, ulcers, wounds, burns, sores, scalds, as well as for the management of diabetes, hypertension, hyperlipidemia, constipation, rheumatism, bronchitis and inflammation.
AIM OF THE REVIEW: This review is an attempt to provide scientific information regarding the ethnopharmacology, phytochemistry, pharmacological and toxicological profiles of Gynura species along with the nomenclature, distribution, taxonomy and botanical features of the genus. A critical analysis has been undertaken to understand the current and future pharmaceutical prospects of the genus.
MATERIALS & METHODS: Several electronic databases, including Google scholar, PubMed, Web of Science, Scopus, ScienceDirect, SpringerLink, Semantic Scholar, MEDLINE and CNKI Scholar, were explored as information sources. The Plant List Index was used for taxonomical authentications. SciFinder and PubChem assisted in the verification of chemical structures.
RESULTS: A large number of phytochemical analyses on Gynura have revealed the presence of around 342 phytoconstituents including pyrrolizidine alkaloids, phenolic compounds, chromanones, phenylpropanoid glycosides, flavonoids, flavonoid glycosides, steroids, steroidal glycosides, cerebrosides, carotenoids, triterpenes, mono- and sesquiterpenes, norisoprenoids, oligosaccharides, polysaccharides and proteins. Several in vitro and in vivo studies have demonstrated the pharmacological potential of Gynura species, including antidiabetic, anti-oxidant, anti-inflammatory, antimicrobial, antihypertensive and anticancer activities. Although the presence of pyrrolizidine alkaloids within a few species has been associated with possible hepatotoxicity, most of the common species have a good safety profile.
CONCLUSIONS: The importance of the genus Gynura both as a prominent contributor in ethnomedicinal systems as well as a source of promising bioactive molecules is evident. Only about one fourth of Gynura species have been studied so far. This review aims to provide some scientific basis for future endeavors, including in-depth biological and chemical investigations into already studied species as well as other lesser known species of Gynura.
PMID: 33465439 [PubMed - as supplied by publisher]
Large-scale regulatory and signaling network assembly through linked open data.
Database (Oxford). 2021 Jan 18;2021:
Authors: Lefebvre M, Gaignard A, Folschette M, Bourdon J, Guziolowski C
Abstract
Huge efforts are currently underway to address the organization of biological knowledge through linked open databases. These databases can be automatically queried to reconstruct regulatory and signaling networks. However, assembling networks implies manual operations due to source-specific identification of biological entities and relationships, multiple life-science databases with redundant information and the difficulty of recovering logical flows in biological pathways. We propose a framework based on Semantic Web technologies to automate the reconstruction of large-scale regulatory and signaling networks in the context of tumor cells modeling and drug screening. The proposed tool is pyBRAvo (python Biological netwoRk Assembly), and here we have applied it to a dataset of 910 gene expression measurements issued from liver cancer patients. The tool is publicly available at https://github.com/pyBRAvo/pyBRAvo.
PMID: 33459761 [PubMed - in process]
Establishment and application of information resource of mutant mice in RIKEN BioResource Research Center.
Lab Anim Res. 2021 Jan 18;37(1):6
Authors: Masuya H, Usuda D, Nakata H, Yuhara N, Kurihara K, Namiki Y, Iwase S, Takada T, Tanaka N, Suzuki K, Yamagata Y, Kobayashi N, Yoshiki A, Kushida T
Abstract
Online databases are crucial infrastructure for facilitating the wide, effective, and efficient use of mouse mutant resources in the life sciences. The number and types of mouse resources have been growing rapidly due to the development of genetic modification technology, together with associated information on genomic sequences and phenotypes. Therefore, data integration technologies to improve the findability, accessibility, interoperability, and reusability of mouse strain data have become essential for mouse strain repositories. In 2020, the RIKEN BioResource Research Center released an integrated database of bioresources, including experimental mouse strains, Arabidopsis thaliana as a laboratory plant, cell lines, microorganisms, and genetic materials, using Resource Description Framework-related technologies. The integrated database offers multiple advanced features for the dissemination of bioresource information. The current version of our online catalog of mouse strains, which functions as part of the integrated database of bioresources, is available from search bars on the websites of the Center (https://brc.riken.jp) and the Experimental Animal Division (https://mus.brc.riken.jp/). The BioResource Research Center also released a genomic variation database of mouse strains established in Japan and Western Europe, MoG+ (https://molossinus.brc.riken.jp/mogplus/), and a database for phenotype-phenotype associations across the mouse phenome using data from the International Mouse Phenotyping Platform. In this review, we describe features of the current versions of the databases related to mouse strain resources in the RIKEN BioResource Research Center and discuss future prospects.
PMID: 33455583 [PubMed]
Enhancing web search result clustering model based on multiview multirepresentation consensus cluster ensemble (mmcc) approach.
PLoS One. 2021;16(1):e0245264
Authors: Sabah A, Tiun S, Sani NS, Ayob M, Taha AY
Abstract
Existing text clustering methods utilize only one representation at a time (single view), whereas multiple views can represent documents. The multiview multirepresentation method enhances clustering quality. Moreover, existing clustering methods that utilize more than one representation at a time (multiview) use representations of the same nature. Hence, it is reasonable to combine clustering methods with multiple views that represent the data differently in order to create a diverse set of candidate clustering solutions. On this basis, an effective dynamic clustering method must consider combining multiple views of data, including the semantic view, lexical view (word weighting), and topic view, as well as the number of clusters. The main goal of this study is to develop a new method that can improve the performance of web search result clustering (WSRC). An enhanced multiview multirepresentation consensus clustering ensemble (MMCC) method is proposed to create a set of diverse candidate solutions and select a high-quality overlapping cluster. The overlapping clusters are obtained from the candidate solutions created by different clustering methods. The framework to develop the proposed MMCC includes five stages: (1) acquiring the standard datasets (MORESQUE and Open Directory Project-239), which are used to validate search result clustering algorithms, (2) preprocessing the dataset, (3) applying multiview multirepresentation clustering models, (4) using the radius-based cluster number estimation algorithm, and (5) employing the consensus clustering ensemble method. Results show an improvement in clustering methods when multiview multirepresentation is used. More importantly, the proposed MMCC model improves the overall performance of WSRC compared with all single-view clustering models.
PMID: 33449949 [PubMed - as supplied by publisher]
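The consensus step of a clustering ensemble, as described in the abstract above, is often built on a co-association matrix: for every pair of documents, the fraction of candidate clusterings that place them in the same cluster. The sketch below shows only that generic building block. It is not the MMCC method itself, whose details are not given in the abstract, and the function name is illustrative.

```python
def co_association(partitions):
    """Fraction of partitions in which each pair of items shares a cluster.

    `partitions` is a list of label lists, one per candidate clustering,
    all of the same length (one label per item). Label values only need to
    be comparable within a single partition.
    """
    if not partitions:
        raise ValueError("at least one partition is required")
    n = len(partitions[0])
    if any(len(labels) != n for labels in partitions):
        raise ValueError("all partitions must label the same items")
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]
```

A final clustering can then be derived from this matrix, for example by grouping pairs whose co-association exceeds a chosen threshold; the threshold and grouping rule are design choices outside this sketch.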
Diverse Taxonomies for Diverse Chemistries: Enhanced Representation of Natural Product Metabolism in UniProtKB.
Metabolites. 2021 Jan 12;11(1):
Authors: Feuermann M, Boutet E, Morgat A, Axelsen KB, Bansal P, Bolleman J, de Castro E, Coudert E, Gasteiger E, Géhant S, Lieberherr D, Lombardot T, Neto TB, Pedruzzi I, Poux S, Pozzato M, Redaschi N, Bridge A, On Behalf Of The UniProt Consortium
Abstract
The UniProt Knowledgebase UniProtKB is a comprehensive, high-quality, and freely accessible resource of protein sequences and functional annotation that covers genomes and proteomes from tens of thousands of taxa, including a broad range of plants and microorganisms producing natural products of medical, nutritional, and agronomical interest. Here we describe work that enhances the utility of UniProtKB as a support for both the study of natural products and for their discovery. The foundation of this work is an improved representation of natural product metabolism in UniProtKB using Rhea, an expert-curated knowledgebase of biochemical reactions, that is built on the ChEBI (Chemical Entities of Biological Interest) ontology of small molecules. Knowledge of natural products and precursors is captured in ChEBI, enzyme-catalyzed reactions in Rhea, and enzymes in UniProtKB/Swiss-Prot, thereby linking chemical structure data directly to protein knowledge. We provide a practical demonstration of how users can search UniProtKB for protein knowledge relevant to natural products through interactive or programmatic queries using metabolite names and synonyms, chemical identifiers, chemical classes, and chemical structures and show how to federate UniProtKB with other data and knowledge resources and tools using semantic web technologies such as RDF and SPARQL. All UniProtKB data are freely available for download in a broad range of formats for users to further mine or exploit as an annotation source, to enrich other natural product datasets and databases.
PMID: 33445429 [PubMed]
A computational approach to predict multi-pathway drug-drug interactions: A case study of irinotecan, a colon cancer medication.
Saudi Pharm J. 2020 Dec;28(12):1507-1513
Authors: Assiri A, Noor A
Abstract
Drug-drug interactions (DDIs) are a potentially distressing corollary of drug interventions, and may result in discomfort, debilitating illness, or even death. Existing research predominantly considers only a single level of interaction; however, serious health complications may result from multi-pathway DDIs, and so new methods are needed to enable predicting and preventing complex DDIs. This article introduces a novel method for the prediction of DDIs at two pharmacological levels (metabolic and transporter interactions) by means of a rule-based model implemented with Semantic Web technologies. The chemotherapy agent irinotecan is used as a case study for demonstrating the validity of this approach. Mechanistic and interaction data were mined from available sources and then used to predict interactors of irinotecan, including potential DDIs mediated by previously unidentified mechanisms. The findings also draw attention to the profound variation between DDI resources, indicating that clinical practice would see significant value from the development of an evidence-based resource to support DDI identification.
PMID: 33424244 [PubMed]
Big data augmentated business trend identification: the case of mobile commerce.
Scientometrics. 2021 Jan 05;:1-27
Authors: Saritas O, Bakhtin P, Kuzminov I, Khabirova E
Abstract
Identifying and monitoring business and technological trends is crucial for the innovation and competitiveness of businesses. The exponential growth of data across the world is invaluable for identifying emerging and evolving trends. On the other hand, the vast amount of data leads to information overload and can no longer be adequately processed without automated methods of extraction, processing, and generation of knowledge. There is a growing need for information systems that monitor and analyse data from heterogeneous and unstructured sources in order to enable timely and evidence-based decision-making. Recent advancements in computing and big data provide enormous opportunities for gathering evidence on future developments and emerging opportunities. The present study demonstrates the use of text-mining and semantic analysis of a large number of documents to investigate business trends in mobile commerce (m-commerce). Particularly with the on-going COVID-19 pandemic and resultant social isolation, m-commerce has become a large technology and business domain with ever-growing market potential. Thus, our study begins with a review of global challenges, opportunities and trends in the development of m-commerce in the world. Next, the study identifies critical technologies and instruments for fully utilizing the potential of the sector by using an intelligent big data analytics system based on in-depth natural language processing utilizing text-mining, machine learning, science bibliometry and technology analysis. The results generated by the system can be used to produce a comprehensive and objective web of interconnected technologies, trends, drivers and barriers to give an overview of the whole landscape of m-commerce in one business intelligence (BI) data mart diagram.
PMID: 33424052 [PubMed - as supplied by publisher]
SNAFU: The Semantic Network and Fluency Utility.
Behav Res Methods. 2020 08;52(4):1681-1699
Authors: Zemla JC, Cao K, Mueller KD, Austerweil JL
Abstract
The verbal fluency task (listing words from a category or words that begin with a specific letter) is a common experimental paradigm that is used to diagnose memory impairments and to understand how we store and retrieve knowledge. Data from the verbal fluency task are analyzed in many different ways, often requiring manual coding that is time intensive and error-prone. Researchers have also used fluency data from groups or individuals to estimate semantic networks (latent representations of semantic memory that describe the relations between concepts) that further our understanding of how knowledge is encoded. However, computational methods used to estimate networks are not standardized and can be difficult to implement, which has hindered widespread adoption. We present SNAFU: the Semantic Network and Fluency Utility, a tool for estimating networks from fluency data and automatizing traditional fluency analyses, including counting cluster switches and cluster sizes, intrusions, perseverations, and word frequencies. In this manuscript, we provide a primer on using the tool, illustrate its application by creating a semantic network for foods, and validate the tool by comparing results to trained human coders using multiple datasets.
PMID: 32128696 [PubMed - indexed for MEDLINE]
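The traditional fluency measures named in the SNAFU abstract above (perseverations, intrusions, and cluster switches) can be illustrated with a minimal sketch. This is not SNAFU's API: the function name fluency_counts, the subcategory mapping, and the example word list are all hypothetical, and the switch count uses a simple assumed scheme (adjacent valid responses whose hand-assigned subcategory differs).

```python
from collections import Counter


def fluency_counts(responses, subcategory):
    """Count simple fluency measures for one response list.

    responses   -- words in the order the participant produced them
    subcategory -- hypothetical mapping from each valid category word
                   to a hand-assigned subcategory label
    """
    words = [w.strip().lower() for w in responses]

    # Perseverations: every repeat of a word after its first mention.
    counts = Counter(words)
    perseverations = sum(c - 1 for c in counts.values())

    # Intrusions: distinct responses that are not valid category members.
    intrusions = [w for w in dict.fromkeys(words) if w not in subcategory]

    # Cluster switches: transitions between adjacent valid responses
    # whose subcategory labels differ (intrusions are skipped,
    # repeated words are kept in the sequence).
    valid = [w for w in words if w in subcategory]
    switches = sum(
        1 for a, b in zip(valid, valid[1:]) if subcategory[a] != subcategory[b]
    )

    return {
        "perseverations": perseverations,
        "intrusions": intrusions,
        "cluster_switches": switches,
    }


# Hypothetical food-category example.
food_subcategory = {
    "apple": "fruit",
    "pear": "fruit",
    "carrot": "vegetable",
    "pea": "vegetable",
}
result = fluency_counts(
    ["apple", "pear", "carrot", "apple", "hammer", "pea"], food_subcategory
)
print(result)
```

Real scoring schemes differ in how they treat repeated words and intrusions when counting switches, so the choices here are assumptions made only to keep the example short.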
Treatable but not curable cancer in England: a retrospective cohort study using cancer registry data and linked data sets
BMJ Open. 2021 Jan 8;11(1):e040808. doi: 10.1136/bmjopen-2020-040808.
ABSTRACT
OBJECTIVES: This study estimates the prevalence of cancers that are categorised as treatable but not curable (TbnC) in England. It provides a quantification of the population and a framework to aid identification of this group to enable the design of tailored support services.
DESIGN: Through consultation with clinical and data experts an algorithmic definition of TbnC was developed. Using cancer registry data sets, with five other linked data sets held by the National Disease Registration Service, the algorithm was applied as part of this retrospective cohort study to estimate the size and characteristics of the TbnC population.
SETTING AND PARTICIPANTS: The health data records of 1.6 million people living with cancer in England in 2015, following a cancer diagnosis between 2001 and 2015, were retrospectively assessed for TbnC status.
RESULTS: An estimated 110 615 people in England were living with TbnC cancer at the end of 2015, following identification of TbnC cancer between 2012 and 2015. In addition, 51 946 people fit the initial search criteria but were found to have been in their last year of life at the end of 2015 and are therefore considered separately here as end of life cases. A further 57 117 people in England were initially identified as being at high risk of recurrence or of having their life shortened by cancer but did not fit the TbnC conceptual framework and were excluded; their results are also reported under 'group B'.
CONCLUSIONS: A population living with TbnC cancer can be identified using data currently collected on a national scale in England. This large population living with TbnC cancer requires personalised treatment and support.
PMID:33419907 | PMC:PMC7798682 | DOI:10.1136/bmjopen-2020-040808
An ontology-based approach for developing a harmonised data-validation tool for European cancer registration.
J Biomed Semantics. 2021 Jan 06;12(1):1
Authors: Nicholson NC, Giusti F, Bettio M, Negrao Carvalho R, Dimitrova N, Dyba T, Flego M, Neamtiu L, Randi G, Martos C
Abstract
BACKGROUND: Population-based cancer registries constitute an important information source in cancer epidemiology. Studies collating and comparing data across regional and national boundaries have proved important for deploying and evaluating effective cancer-control strategies. A critical aspect in correctly comparing cancer indicators across regional and national boundaries lies in ensuring a good and harmonised level of data quality, which is a primary motivator for a centralised collection of pseudonymised data. The recent introduction of the European Union's general data-protection regulation (GDPR) imposes stricter conditions on the collection, processing, and sharing of personal data. It also considers pseudonymised data as personal data. The new regulation motivates the need to find solutions that allow a continuation of the smooth processes leading to harmonised European cancer-registry data. One element in this regard would be the availability of a data-validation software tool based on a formalised depiction of the harmonised data-validation rules, allowing an eventual devolution of the data-validation process to the local level.
RESULTS: A semantic data model was derived from the data-validation rules for harmonising cancer-data variables at European level. The data model was encapsulated in an ontology developed using the Web-Ontology Language (OWL) with the data-model entities forming the main OWL classes. The data-validation rules were added as axioms in the ontology. The reasoning function of the resulting ontology demonstrated its ability to trap registry-coding errors and in some instances to be able to correct errors.
CONCLUSIONS: Describing the European cancer-registry core data set in terms of an OWL ontology affords a tool based on a formalised set of axioms for validating a cancer-registry's data set according to harmonised, supra-national rules. The fact that the data checks are inherently linked to the data model would lead to less maintenance overheads and also allow automatic versioning synchronisation, important for distributed data-quality checking processes.
PMID: 33407816 [PubMed - in process]
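The idea in the abstract above, that validation rules attached directly to the data model can trap registry-coding errors, can be sketched in simplified form. The sketch uses plain Python predicates rather than OWL axioms and a reasoner, so it is a stand-in for the technique, not the authors' ontology. The field names (date_of_birth, date_of_diagnosis) and both rules are hypothetical examples, not the harmonised European rules.

```python
from datetime import date

# Hypothetical rules, each a named predicate over one registry record.
# In the paper's approach these would be axioms in an OWL ontology.
RULES = {
    "diagnosis_not_before_birth": lambda r: (
        r["date_of_diagnosis"] >= r["date_of_birth"]
    ),
    "age_at_diagnosis_plausible": lambda r: (
        0 <= r["date_of_diagnosis"].year - r["date_of_birth"].year <= 120
    ),
}


def validate(record):
    """Return the names of all rules that the record violates."""
    return [name for name, rule in RULES.items() if not rule(record)]


# Hypothetical records: one consistent, one with a coding error.
good = {
    "date_of_birth": date(1950, 3, 1),
    "date_of_diagnosis": date(2012, 6, 15),
}
bad = {
    "date_of_birth": date(1990, 1, 1),
    "date_of_diagnosis": date(1985, 1, 1),
}

print(validate(good))
print(validate(bad))
```

Keeping the rules in one declarative table, next to the fields they constrain, mirrors the abstract's point that checks tied to the data model are easier to maintain and version together.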
A Network Analysis of Research Topics and Trends in End-of-Life Care and Nursing.
Int J Environ Res Public Health. 2021 Jan 04;18(1):
Authors: Kim K, Jang SG, Lee KS
Abstract
This study identified the trends in end-of-life care and nursing through text network analysis. About 18,935 articles published until September 2019 were selected through searches on PubMed, Embase, Cochrane, Web of Science, and the Cumulative Index to Nursing and Allied Health Literature. For topic modeling, Latent Dirichlet Allocation (K = 8) was applied. Most of the top-ranked topic words for the degree and betweenness centralities were consistent with the top 1% through the semantic network diagram. Among the important keywords examined every five years, "care" was unrivaled. When analyzing the two- and three-word combinations, there were many themes representing places, roles, and actions. Topic modeling yielded eight topics: ethical issues of decision-making for treatment withdrawal, symptom management to improve the quality of life, development of end-of-life knowledge education programs, life-sustaining care plans for elderly patients, home-based hospice, communication experience, patient symptom investigation, and analysis considering patient preferences. This study is meaningful as it analyzed a large amount of existing literature and considered the main trends of end-of-life care and nursing research based on the core subject control and semantic structure.
PMID: 33406715 [PubMed - in process]
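The degree centrality mentioned in the abstract above can be illustrated on a toy keyword co-occurrence network. This is a minimal sketch, not the study's pipeline: the function name degree_centrality and the keyword lists are hypothetical, two keywords are assumed to be linked when they appear in the same article, and betweenness centrality and LDA topic modeling are not covered.

```python
from collections import defaultdict
from itertools import combinations


def degree_centrality(documents):
    """Degree centrality of keywords in an unweighted co-occurrence network.

    documents -- list of keyword lists, one per article (hypothetical input).
    Keywords that never co-occur with another keyword are not nodes.
    """
    neighbors = defaultdict(set)
    for keywords in documents:
        # Link every pair of distinct keywords within one article.
        for a, b in combinations(sorted(set(keywords)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)

    n = len(neighbors)
    if n < 2:
        return {k: 0.0 for k in neighbors}
    # Normalize each node's degree by the maximum possible degree, n - 1.
    return {k: len(v) / (n - 1) for k, v in neighbors.items()}


# Hypothetical keyword sets for three articles.
articles = [
    ["care", "hospice", "home"],
    ["care", "symptom"],
    ["care", "communication", "symptom"],
]
centrality = degree_centrality(articles)
for word, score in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {score:.2f}")
```

In this toy network "care" is linked to every other keyword, which loosely parallels the abstract's observation that "care" dominated the important keywords.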