Semantic Web
The 2021 update of the EPA's adverse outcome pathway database
Sci Data. 2021 Jul 12;8(1):169. doi: 10.1038/s41597-021-00962-3.
ABSTRACT
The EPA developed the Adverse Outcome Pathway Database (AOP-DB) to better characterize adverse outcomes of toxicological interest that are relevant to human health and the environment. Here we present the most recent version of the EPA Adverse Outcome Pathway Database (AOP-DB), version 2. AOP-DB v.2 introduces several substantial updates, which include automated data pulls from the AOP-Wiki 2.0, the integration of tissue-gene network data, and human AOP-gene data by population, semantic mapping and SPARQL endpoint creation, in addition to the presentation of the first publicly available AOP-DB web user interface. Potential users of the data may investigate specific molecular targets of an AOP, the relation of those gene/protein targets to other AOPs, cross-species, pathway, or disease-AOP relationships, or frequencies of AOP-related functional variants in particular populations, for example. Version updates described herein help inform new testable hypotheses about the etiology and mechanisms underlying adverse outcomes of environmental and toxicological concern.
PMID:34253739 | DOI:10.1038/s41597-021-00962-3
The SPARC DRC: Building a Resource for the Autonomic Nervous System Community
Front Physiol. 2021 Jun 24;12:693735. doi: 10.3389/fphys.2021.693735. eCollection 2021.
ABSTRACT
The Data and Resource Center (DRC) of the NIH-funded SPARC program is developing databases, connectivity maps, and simulation tools for the mammalian autonomic nervous system. The experimental data and mathematical models supplied to the DRC by the SPARC consortium are curated, annotated and semantically linked via a single knowledgebase. A data portal has been developed that allows discovery of data and models both via semantic search and via an interface that includes Google Map-like 2D flatmaps for displaying connectivity, and 3D anatomical organ scaffolds that provide a common coordinate framework for cross-species comparisons. We discuss examples that illustrate the data pipeline, which includes data upload, curation, segmentation (for image data), registration against the flatmaps and scaffolds, and finally display via the web portal, including the link to freely available online computational facilities that will enable neuromodulation hypotheses to be investigated by the autonomic neuroscience community and device manufacturers.
PMID:34248680 | PMC:PMC8265045 | DOI:10.3389/fphys.2021.693735
The impact of semantics on aspect level opinion mining
PeerJ Comput Sci. 2021 Jun 18;7:e558. doi: 10.7717/peerj-cs.558. eCollection 2021.
ABSTRACT
Many users now prefer to shop online. Shopping websites allow customers to submit comments and provide feedback on purchased products. Opinion mining and sentiment analysis are used to analyze product comments to help sellers and purchasers decide whether to buy products. However, the nature of online comments affects the performance of the opinion mining process because they may contain negation words or aspects unrelated to the product. To address these problems, a semantic-based aspect level opinion mining (SALOM) model is proposed. SALOM extracts the product aspects based on semantic similarity and classifies the comments. The proposed model considers negation words and other types of product aspects, such as aspects' synonyms, hyponyms, and hypernyms, to improve the accuracy of classification. Three different datasets are used to evaluate the proposed SALOM. The experimental results are promising in terms of Precision, Recall, and F-measure. The performance reaches 94.8% precision, 93% recall, and 92.6% F-measure.
PMID:34239969 | PMC:PMC8237320 | DOI:10.7717/peerj-cs.558
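Illustrative sketch (not taken from the paper): the SALOM abstract above describes mapping aspect synonyms to product aspects and accounting for negation words. The snippet below shows one simple way such a step could look. The lexicons, the `window` parameter, and the function name are hypothetical, and a hard-coded synonym table stands in for the semantic-similarity computation the authors describe.

```python
# Hypothetical toy lexicons; a real system would derive these from a
# lexical resource and a sentiment lexicon.
NEGATIONS = {"not", "no", "never"}
POLARITY = {"good": 1, "great": 1, "bad": -1, "poor": -1}
ASPECT_SYNONYMS = {
    "battery": "battery",
    "charge": "battery",
    "screen": "display",
    "display": "display",
}


def aspect_sentiment(comment, window=2):
    """Score each aspect mentioned in a comment.

    An opinion word is attached to the most recently seen aspect, and its
    polarity is flipped if a negation word occurs within `window` tokens
    before it.
    """
    tokens = comment.lower().replace(".", " ").replace(",", " ").split()
    scores = {}
    current_aspect = None
    for i, token in enumerate(tokens):
        if token in ASPECT_SYNONYMS:
            current_aspect = ASPECT_SYNONYMS[token]
        elif token in POLARITY and current_aspect is not None:
            polarity = POLARITY[token]
            preceding = tokens[max(0, i - window):i]
            if any(t in NEGATIONS for t in preceding):
                polarity = -polarity
            scores[current_aspect] = scores.get(current_aspect, 0) + polarity
    return scores


result = aspect_sentiment("The battery is not good, but the screen is great.")
# result == {"battery": -1, "display": 1}
```

Here "screen" is resolved to the canonical aspect "display", and "not good" is counted as negative for "battery".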
A novel computational drug repurposing approach for Systemic Lupus Erythematosus (SLE) treatment using Semantic Web technologies
Saudi J Biol Sci. 2021 Jul;28(7):3886-3892. doi: 10.1016/j.sjbs.2021.03.068. Epub 2021 Apr 2.
NO ABSTRACT
PMID:34220244 | PMC:PMC8241633 | DOI:10.1016/j.sjbs.2021.03.068
Trends in Nursing Research on Infections: Semantic Network Analysis and Topic Modeling
Int J Environ Res Public Health. 2021 Jun 28;18(13):6915. doi: 10.3390/ijerph18136915.
ABSTRACT
BACKGROUND: Many countries around the world are currently threatened by the COVID-19 pandemic, and nurses are facing increasing responsibilities and work demands related to infection control. To establish a developmental strategy for infection control, it is important to analyze, understand, or visualize the accumulated data gathered from research in the field of nursing.
METHODS: A total of 4854 articles published between 1978 and 2017 were retrieved from the Web of Science. Abstracts from these articles were extracted, and network analysis was conducted using the semantic network module.
RESULTS: 'wound', 'injury', 'breast', 'dressing', 'temperature', 'drainage', 'diabetes', 'abscess', and 'cleaning' were identified as the keywords with high values of degree centrality, betweenness centrality, and closeness centrality; hence, they were determined to be influential in the network. The major topics were 'PLWH' (people living with HIV), 'pregnancy', and 'STI' (sexually transmitted infection).
CONCLUSIONS: Diverse infection research has been conducted on the topics of blood-borne infections, sexually transmitted infections, respiratory infections, urinary tract infections, and bacterial infections. STIs (including HIV), pregnancy, and bacterial infections have been the focus of particularly intense research by nursing researchers. More research on viral infections, urinary tract infections, immune-related topics, and hospital-acquired infections will be needed.
PMID:34203191 | DOI:10.3390/ijerph18136915
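Illustrative sketch (not taken from the paper): the nursing-research abstract above ranks keywords by degree centrality in a semantic network. Assuming the network is a keyword co-occurrence graph, normalized degree centrality can be computed as below. The toy documents and function names are hypothetical; only the keywords themselves appear in the abstract.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(documents):
    """Build an undirected keyword graph: two keywords are linked if they
    appear together in at least one document."""
    neighbors = defaultdict(set)
    for keywords in documents:
        for a, b in combinations(sorted(set(keywords)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def degree_centrality(neighbors):
    """Normalized degree centrality: degree divided by (n - 1)."""
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical toy data, one keyword list per abstract.
toy_documents = [
    ["wound", "dressing", "drainage"],
    ["wound", "diabetes"],
    ["wound", "abscess", "drainage"],
]
centrality = degree_centrality(cooccurrence_graph(toy_documents))
# centrality["wound"] == 1.0, centrality["drainage"] == 0.75
```

In this toy graph 'wound' co-occurs with every other keyword, so it has the maximum centrality of 1.0. Betweenness and closeness centrality need shortest-path computations and are not shown.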
DUI: the drug use insights web server
Bioinformatics. 2021 Jun 23:btab461. doi: 10.1093/bioinformatics/btab461. Online ahead of print.
ABSTRACT
MOTIVATION: Substance abuse constitutes one of the major contemporary health epidemics. Recently, the use of social media platforms has garnered interest as a novel source of data for drug addiction epidemiology. Often, however, the language used in such forums comprises slang and jargon. Currently, there are no publicly available resources to automatically analyse the esoteric language use in the social media drug-use sub-culture. This lacuna introduces critical challenges for interpreting, sensemaking and modeling of addiction epidemiology using social media.
RESULTS: Drug-Use Insights (DUI) is a public and open-source web application to address the aforementioned deficiency. DUI is underpinned by a hierarchical taxonomy encompassing 108 different addiction related categories consisting of over 9,000 terms, where each category encompasses a set of semantically related terms. These categories and terms were established by utilizing thematic analysis in conjunction with term embeddings generated from 7,472,545 Reddit posts made by 1,402,017 redditors. Given post(s) from social media forums such as Reddit and Twitter, DUI can be used foremost to identify constituent terms related to drug use. Furthermore, the DUI categories and integrated visualization tools can be leveraged for semantic and exploratory analysis. To the best of our knowledge, DUI utilizes the largest number of substance use and recovery social media posts used in a study and represents the first significant online taxonomy of drug abuse terminology.
AVAILABILITY: The DUI web server and source code are available at: http://haddock9.sfsu.edu/insight/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
PMID:34164647 | DOI:10.1093/bioinformatics/btab461
LSTMCNNsucc: A Bidirectional LSTM and CNN-Based Deep Learning Method for Predicting Lysine Succinylation Sites
Biomed Res Int. 2021 May 28;2021:9923112. doi: 10.1155/2021/9923112. eCollection 2021.
ABSTRACT
Lysine succinylation is a typical protein post-translational modification and plays a crucial regulatory role in cellular processes. Identifying succinylation sites is fundamental to exploring its functions. Although many computational methods have been developed to deal with this challenge, few have considered the semantic relationships between residues. We combined long short-term memory (LSTM) and convolutional neural network (CNN) into a deep learning method for predicting succinylation sites. The proposed method obtained a Matthews correlation coefficient of 0.2508 on the independent test, outperforming state-of-the-art methods. We also performed an enrichment analysis of succinylation proteins. The results showed that the functions of succinylation were conserved across species but differed to a certain extent between species. On the basis of the proposed method, we developed a user-friendly web server for predicting succinylation sites.
PMID:34159204 | PMC:PMC8188601 | DOI:10.1155/2021/9923112
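Illustrative sketch (not taken from the paper): the LSTMCNNsucc abstract above reports its result as a Matthews correlation coefficient (MCC). For readers unfamiliar with the metric, the standard definition from the four confusion-matrix counts is shown below. The function name and the convention of returning 0.0 when the denominator is zero are choices made for this example.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention assumed here: return 0.0 when a row or column of the
    # confusion matrix is empty and the coefficient is undefined.
    return numerator / denominator if denominator else 0.0


perfect = matthews_corrcoef(tp=5, tn=5, fp=0, fn=0)   # 1.0
chance = matthews_corrcoef(tp=5, tn=5, fp=5, fn=5)    # 0.0
inverted = matthews_corrcoef(tp=0, tn=0, fp=5, fn=5)  # -1.0
```

Because MCC uses all four counts, it is commonly preferred over plain accuracy when positive sites are much rarer than negative ones.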
TextEssence: A Tool for Interactive Analysis of Semantic Shifts Between Corpora
Proc Conf. 2021 Jun;2021:106-115.
ABSTRACT
Embeddings of words and concepts capture syntactic and semantic regularities of language; however, they have seen limited use as tools to study characteristics of different corpora and how they relate to one another. We introduce TextEssence, an interactive system designed to enable comparative analysis of corpora using embeddings. TextEssence includes visual, neighbor-based, and similarity-based modes of embedding analysis in a lightweight, web-based interface. We further propose a new measure of embedding confidence based on nearest neighborhood overlap, to assist in identifying high-quality embeddings for corpus analysis. A case study on COVID-19 scientific literature illustrates the utility of the system. TextEssence can be found at https://textessence.github.io.
PMID:34151319 | PMC:PMC8212692
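Illustrative sketch (not taken from the paper): the TextEssence abstract above proposes an embedding confidence measure based on nearest neighborhood overlap but does not give its formula. One plausible reading, assumed here, is to train several embedding replicates on the same corpus and measure how much a term's k-nearest-neighbor sets agree across replicates. The Jaccard formulation, the function names, and the toy vectors are all assumptions for illustration.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_neighbors(embeddings, term, k):
    """Return the set of the k terms most cosine-similar to `term`."""
    target = embeddings[term]
    others = [t for t in embeddings if t != term]
    others.sort(key=lambda t: cosine(target, embeddings[t]), reverse=True)
    return set(others[:k])


def neighborhood_overlap(replicates, term, k):
    """Mean pairwise Jaccard overlap of a term's k-nearest-neighbor sets
    across embedding replicates (assumed formulation)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    neighbor_sets = [nearest_neighbors(rep, term, k) for rep in replicates]
    pairs = list(combinations(neighbor_sets, 2))
    if not pairs:
        raise ValueError("need at least two embedding replicates")
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)


# Hypothetical 2-D embeddings from two training runs on the same corpus.
replicate_a = {"virus": [1.0, 0.0], "infection": [0.9, 0.1],
               "vaccine": [0.8, 0.3], "table": [0.0, 1.0]}
replicate_b = {"virus": [1.0, 0.0], "infection": [0.95, 0.05],
               "vaccine": [0.1, 1.0], "table": [0.7, 0.2]}
overlap = neighborhood_overlap([replicate_a, replicate_b], "virus", k=2)
```

In this toy case the two runs agree on only one of the two nearest neighbors of "virus", giving an overlap of 1/3. A value near 1 would suggest the term's neighborhood is stable across runs.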
Oral and written communication skills of adolescents with prenatal alcohol exposure (PAE) compared with those with no/low PAE: A systematic review
Int J Lang Commun Disord. 2021 Jun 16. doi: 10.1111/1460-6984.12644. Online ahead of print.
ABSTRACT
BACKGROUND: Prenatal alcohol exposure (PAE) is associated with growth deficits and neurodevelopmental impairment including foetal alcohol spectrum disorder (FASD). Difficulties with oral and written communication skills are common among children with PAE; however, less is known about how communication skills of adolescents who have PAE compare with those who do not. Adolescence is a critical time for development, supporting the transition into adulthood, but it is considered a high-risk period for those with FASD.
AIMS: We conducted a systematic review to synthesize evidence regarding oral and written communication skills of adolescents with PAE or FASD and how they compare with those with no PAE.
METHODS & PROCEDURES: A comprehensive search strategy used seven databases: Cochrane Library, Cinahl, Embase, Medline, PsycInfo, Eric and Web of Science. Included studies reported on at least one outcome related to oral and written communication for a PAE (or FASD) group as well as a no/low PAE group, both with age ranges of 10-24 years. Quality assessment was undertaken.
MAIN CONTRIBUTION: Communication skills most often assessed in the seven studies included in this review were semantic knowledge, semantic processing, and verbal learning and memory. These communication skills, in addition to reading and spelling, were commonly weaker among adolescents with PAE compared with those with no/low PAE. However, the findings were inconsistent across studies, and studies differed in their methodologies.
CONCLUSIONS & IMPLICATIONS: Our results emphasize that for adolescents with PAE, communication skills in both oral and written modalities should be comprehensively understood in assessment and when planning interventions. A key limitation of the existing literature is that comparison groups often include some participants with a low level of PAE, and that PAE definitions used to allocate participants to groups differ across studies.
WHAT THIS PAPER ADDS: What is already known on the subject: PAE and FASD are associated with deficits in oral and written communication skills. Studies to date have mostly focused on children with a FASD diagnosis as well as combined groups of children and adolescents with FASD or PAE. There is a gap in what is known about oral and written communication skills of adolescents, specifically, who have PAE or FASD. This has implications for the provision of assessment and supports during a period of increased social and academic demands. What this study adds to existing knowledge: This review provides systematic identification, assessment and synthesis of the current literature related to oral and written communication skills of adolescents with PAE compared with those with no/low PAE. The review revealed a small knowledge base with inconsistent methodologies and findings across studies. However, the findings overall highlight that adolescents with PAE have weaker skills in oral and written language than those with no/low PAE. Results are discussed in relation to education, social and emotional well-being, and forensic contexts. What are the potential or actual clinical implications of this work? Findings emphasize that for adolescents with PAE, comprehensive assessment of both oral and written communication skills, through both standardized and functional tasks, should be undertaken. Speech-language pathologists have a key role in assessment with individuals who have PAE.
PMID:34137136 | DOI:10.1111/1460-6984.12644
Ethnomedicinal uses, phytochemistry, pharmacological activities and toxicological profile of Glycosmis pentaphylla (Retz.) DC.: A review
J Ethnopharmacol. 2021 Jun 8:114313. doi: 10.1016/j.jep.2021.114313. Online ahead of print.
ABSTRACT
ETHNOPHARMACOLOGICAL RELEVANCE: Glycosmis pentaphylla (Retz.) DC. is a perennial shrub indigenous to the tropical and subtropical regions of India, China, Sri Lanka, Myanmar, Bangladesh, Indonesia, Malaysia, Thailand, Vietnam, the Philippines, Java, Sumatra, Borneo and Australia. The plant is used extensively within these regions as a traditional medicine for the treatment of a variety of ailments including cough, fever, chest pain, anemia, jaundice, liver disorders, inflammation, bronchitis, rheumatism, urinary tract infections, pain, bone fractures, toothache, gonorrhea, diabetes, cancer and other chronic diseases.
AIM OF THE REVIEW: This review aims to present up-to-date information regarding the taxonomy, botany, distribution, ethnomedicinal uses, phytochemistry, pharmacology and toxicological profile of G. pentaphylla. The presented information was analyzed critically to understand current work undertaken on this species and explore possible future prospects for this plant in pharmaceutical research.
MATERIALS & METHODS: Bibliographic databases, including Google Scholar, PubMed, Web of Science, ScienceDirect, SpringerLink, Wiley Online Library, Semantic Scholar, Europe PMC, Scopus, and MEDLINE, were explored thoroughly for the collection of relevant information. The structures of phytoconstituents were confirmed with PubChem and SciFinder databases. Taxonomical information on the plant was presented in accordance with The Plant List (version 1.1).
RESULTS: Extensive phytochemical investigations into different parts of G. pentaphylla have revealed the presence of at least 354 secondary metabolites belonging to structurally diverse classes including alkaloids, amides, phenolic compounds, flavonoids, glycosides, aromatic compounds, steroids, terpenoids, and fatty derivatives. A large number of in vitro and in vivo experiments have demonstrated that G. pentaphylla had anticancer, antimutagenic, antibacterial, antifungal, anthelmintic, mosquitocidal, antidiabetic, antihyperlipidemic, anti-oxidant, anti-inflammatory, analgesic, antipyretic, anti-arsenicosis, and wound healing properties. Toxicological studies have established the absence of any significant adverse reactions and showed that the plant had a moderate safety profile.
CONCLUSIONS: G. pentaphylla can be suggested as a source of inspiration for the development of novel drugs, especially anticancer, antimicrobial, anthelmintic, and mosquitocidal agents. Moreover, bioassay-guided investigations into its diverse classes of secondary metabolites, especially the large pool of nitrogen-containing alkaloids and amides, promise the development of novel drug candidates. Future pharmacological studies into this species are also warranted as many of its traditional uses are yet to be validated scientifically.
PMID:34116186 | DOI:10.1016/j.jep.2021.114313
A Care Knowledge Management System Based on an Ontological Model of Caring for People With Dementia: Knowledge Representation and Development Study
J Med Internet Res. 2021 Jun 8;23(6):e25968. doi: 10.2196/25968.
ABSTRACT
BACKGROUND: Caregivers of people with dementia find it extremely difficult to choose the best care method because of complex environments and the variable symptoms of dementia. To alleviate this care burden, interventions have been proposed that use computer- or web-based applications. For example, an automatic diagnosis of the condition can improve the well-being of both the person with dementia and the caregiver. Other interventions support the individual with dementia in living independently.
OBJECTIVE: The aim of this study was to develop an ontology-based care knowledge management system for people with dementia that will provide caregivers with a care guide suited to the environment and to the individual patient's symptoms. This should also enable knowledge sharing among caregivers.
METHODS: To build the care knowledge model, we reviewed existing ontologies that contain concepts and knowledge descriptions relating to the care of those with dementia, and we considered dementia care manuals. The basic concepts of the care ontology were confirmed by experts in Korea. To infer the different care methods required for the individual dementia patient, reasoning rules defined in the Semantic Web Rule Language and Prolog were utilized. The accuracy of the care knowledge in the ontological model and the usability of the proposed system were evaluated by using the Pellet reasoner and OntOlogy Pitfall Scanner!, and a survey and interviews were conducted with caregivers working in care centers in Korea.
RESULTS: The care knowledge model contains six top-level concepts: care knowledge, task, assessment, person, environment, and medical knowledge. Based on this ontological model of dementia care, caregivers at a dementia care facility in Korea were able to access the care knowledge easily through a graphical user interface. The evaluation by the care experts showed that the system contained accurate care knowledge and a level of assessment comparable to normal assessment tools.
CONCLUSIONS: In this study, we developed a care knowledge system that can provide caregivers with care guides suited to individuals with dementia. We anticipate that the system could reduce the workload of caregivers.
PMID:34100762 | DOI:10.2196/25968
Intrapartum interventions and outcomes for women and children following induction of labour at term in uncomplicated pregnancies: a 16-year population-based linked data study
BMJ Open. 2021 May 31;11(6):e047040. doi: 10.1136/bmjopen-2020-047040.
ABSTRACT
OBJECTIVES: We compared intrapartum interventions and outcomes for mothers, neonates and children up to 16 years, for induction of labour (IOL) versus spontaneous labour onset in uncomplicated term pregnancies with live births.
DESIGN: We used population linked data from New South Wales, Australia (2001-2016) for healthy women giving birth at 37+0 to 41+6 weeks. Descriptive statistics and logistic regression were performed for intrapartum interventions, postnatal maternal and neonatal outcomes, and long-term child outcomes adjusted for maternal age, country of birth, socioeconomic status, parity and gestational age.
RESULTS: Of 474 652 included births, 69 397 (15%) had an IOL for non-medical reasons. Primiparous women with IOL versus spontaneous onset differed significantly for: spontaneous vaginal birth (42.7% vs 62.3%), instrumental birth (28.0% vs 23.9%), intrapartum caesarean section (29.3% vs 13.8%), epidural (71.0% vs 41.3%), episiotomy (41.2% vs 30.5%) and postpartum haemorrhage (2.4% vs 1.5%). There was a similar trend in outcomes for multiparous women, except for caesarean section which was lower (5.3% vs 6.2%). For both groups, third and fourth degree perineal tears were lower overall in the IOL group: primiparous women (4.2% vs 4.9%), multiparous women (0.7% vs 1.2%), though overall vaginal repair was higher (89.3% vs 84.3%). Following induction, incidences of neonatal birth trauma, resuscitation and respiratory disorders were higher, as were admissions to hospital for infections (ear, nose, throat, respiratory and sepsis) up to 16 years. There was no difference in hospitalisation for asthma or eczema, or for neonatal death (0.06% vs 0.08%), or in total deaths up to 16 years.
CONCLUSION: IOL for non-medical reasons was associated with higher birth interventions, particularly in primiparous women, and more adverse maternal, neonatal and child outcomes for most variables assessed. The size of effect varied by parity and gestational age, making these important considerations when informing women about the risks and benefits of IOL.
PMID:34059509 | PMC:PMC8169493 | DOI:10.1136/bmjopen-2020-047040
ISO 21526 Conform Metadata Editor for FAIR Unicode SKOS Thesauri
Stud Health Technol Inform. 2021 May 24;278:94-100. doi: 10.3233/SHTI210056.
ABSTRACT
Metadata repositories are an indispensable component of data integration infrastructures and support semantic interoperability between knowledge organization systems. Standards for metadata representation like the ISO/IEC 11179 as well as the Resource Description Framework (RDF) and the Simple Knowledge Organization System (SKOS) by the World Wide Web Consortium were published to ensure metadata interoperability, maintainability and sustainability. The FAIR guidelines were composed to explicate those aspects in four principles divided into fifteen sub-principles. The ISO/IEC 21526 standard extends the 11179 standard for the domain of health care and mandates that SKOS be used for certain scenarios. In medical informatics, the composition of health care SKOS classification schemes is often managed by documentalists and data scientists. They use editors, which support them in producing comprehensive and valid metadata. Current metadata editors either do not properly support the SKOS resource annotations, require server applications or make use of additional databases for metadata storage. These characteristics are contrary to the application independence and versatility of raw Unicode SKOS files, e.g. the custom text arrangement, extensibility or copy & paste editing. We provide an application that adds navigation, auto-completion and validity-check capabilities on top of a regular Unicode text editor.
PMID:34042881 | DOI:10.3233/SHTI210056
Health Informatics Learning Objectives on an Interoperable, Collaborative Platform
Stud Health Technol Inform. 2021 May 27;281:1019-1020. doi: 10.3233/SHTI210335.
ABSTRACT
Catalogues of learning objectives for Biomedical and Health Informatics are relevant prerequisites for systematic and effective qualification. Catalogue management needs to integrate different catalogues and support collaborative revisioning. The Health Informatics Learning Objectives Navigator (HI-LONa) offers an open, interoperable platform based on Semantic Web Technology. At present HI-LONa contains 983 learning objectives of three relevant catalogues. HI-LONa successfully supported a multiprofessional consensus process.
PMID:34042830 | DOI:10.3233/SHTI210335
COVID-19 preVIEW: Semantic Search to Explore COVID-19 Research Preprints
Stud Health Technol Inform. 2021 May 27;281:78-82. doi: 10.3233/SHTI210124.
ABSTRACT
During the current COVID-19 pandemic, rapid access to in-depth information is crucial for deriving knowledge about diagnosis, disease trajectory, and treatment, or for adapting the rules of conduct in public. The increased importance of preprints for COVID-19 research initiated the design of the preprint search engine preVIEW. Conceptually, it is a lightweight semantic search engine that focuses on easy inclusion of specialized COVID-19 textual collections and provides a user-friendly web interface for semantic information retrieval. In order to support semantic search functionality, we integrated a text mining workflow for indexing with relevant terminologies. Currently, diseases, human genes and SARS-CoV-2 proteins are annotated, and more will be added in the future. The system integrates collections from several different preprint servers that are used in the biomedical domain to publish non-peer-reviewed work, thereby enabling one central access point for the users. In addition, our service offers facet searching, export functionality and API access. COVID-19 preVIEW is publicly available at https://preview.zbmed.de.
PMID:34042709 | DOI:10.3233/SHTI210124
Ontology-Based Personalized Cognitive Behavioural Plans for Patients with Mild Depression
Stud Health Technol Inform. 2021 May 27;281:729-733. doi: 10.3233/SHTI210268.
ABSTRACT
Cognitive Behavioural Therapy (CBT) is an action-oriented psychotherapy that combines cognitive and behavioural techniques for the psychosocial treatment of depression, and is considered by many to be the gold standard in psychotherapy. More recently, computerized CBT (CCBT) has been deployed to help increase availability and access to this evidence-based therapy. In this vein, a CBT ontology, as a shared common understanding of the domain, can facilitate the aggregation, verification, and operationalization of computerized CBT knowledge. Moreover, as opposed to black-box applications, ontology-enabled systems allow recommended, evidence-based treatment interventions to be traced back to the corresponding psychological concepts. We used a Knowledge Management approach to synthesize and computerize CBT knowledge from multiple sources into a CBT ontology, which allows generating personalized action plans for treating mild depression, using the Web Ontology Language (OWL) and Semantic Web Rule Language (SWRL). We performed a formative evaluation of the CBT ontology in terms of its completeness, consistency, and conciseness.
PMID:34042672 | DOI:10.3233/SHTI210268
Leveraging Genetic Reports and Electronic Health Records for the Prediction of Primary Cancers: Algorithm Development and Validation Study
JMIR Med Inform. 2021 May 25;9(5):e23586. doi: 10.2196/23586.
ABSTRACT
BACKGROUND: Precision oncology has the potential to leverage clinical and genomic data in advancing disease prevention, diagnosis, and treatment. A key research area focuses on the early detection of primary cancers and potential prediction of cancers of unknown primary in order to facilitate optimal treatment decisions.
OBJECTIVE: This study presents a methodology to harmonize phenotypic and genetic data features to classify primary cancer types and predict cancers of unknown primaries.
METHODS: We extracted genetic data elements from oncology genetic reports of 1011 patients with cancer and their corresponding phenotypical data from Mayo Clinic's electronic health records. We modeled both genetic and electronic health record data with HL7 Fast Healthcare Interoperability Resources. The semantic web Resource Description Framework was employed to generate the network-based data representation (ie, patient-phenotypic-genetic network). Based on the Resource Description Framework data graph, the Node2vec graph-embedding algorithm was applied to generate features. Multiple machine learning and deep learning backbone models were compared for cancer prediction performance.
RESULTS: With 6 machine learning tasks designed in the experiment, we demonstrated the proposed method achieved favorable results in classifying primary cancer types (area under the receiver operating characteristic curve [AUROC] 96.56% for all 9 cancer predictions on average based on the cross-validation) and predicting unknown primaries (AUROC 80.77% for all 8 cancer predictions on average for real-patient validation). To demonstrate the interpretability, 17 phenotypic and genetic features that contributed the most to the prediction of each cancer were identified and validated based on a literature review.
CONCLUSIONS: Accurate prediction of cancer types can be achieved with existing electronic health record data with satisfactory precision. The integration of genetic reports improves prediction, illustrating the translational values of incorporating genetic tests early at the diagnosis stage for patients with cancer.
PMID:34032581 | DOI:10.2196/23586
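Illustrative sketch (not taken from the paper): the cancer-prediction abstract above builds a patient-phenotypic-genetic network from Resource Description Framework triples and derives features with Node2vec. The snippet below shows only the first stage under simplifying assumptions: triples become an undirected graph, and uniform random walks are sampled from it. Uniform walks correspond to Node2vec's special case where both bias parameters equal 1; the biased walks and the skip-gram training that produces the actual embeddings are omitted. All identifiers in the toy triples are hypothetical.

```python
import random
from collections import defaultdict


def build_adjacency(triples):
    """Treat each (subject, predicate, object) triple as an undirected
    edge between subject and object; the predicate is ignored here."""
    adjacency = defaultdict(list)
    for subject, _predicate, obj in triples:
        adjacency[subject].append(obj)
        adjacency[obj].append(subject)
    return adjacency


def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Sample uniform random walks starting from every node."""
    rng = random.Random(seed)
    walks = []
    for start in sorted(adjacency):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                # Every node has at least one neighbor by construction.
                walk.append(rng.choice(adjacency[walk[-1]]))
            walks.append(walk)
    return walks


# Hypothetical toy triples.
toy_triples = [
    ("patient:1", "hasPhenotype", "pheno:P1"),
    ("patient:1", "hasVariant", "gene:G1"),
    ("patient:2", "hasVariant", "gene:G1"),
    ("patient:2", "hasDiagnosis", "cancer:C1"),
]
adjacency = build_adjacency(toy_triples)
walks = random_walks(adjacency, walk_length=5, walks_per_node=2)
```

The resulting walks play the role of sentences: feeding them to a skip-gram model would yield one vector per node, which could then serve as input features for a classifier.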
DeepGOWeb: fast and accurate protein function prediction on the (Semantic) Web
Nucleic Acids Res. 2021 May 21:gkab373. doi: 10.1093/nar/gkab373. Online ahead of print.
ABSTRACT
Understanding the functions of proteins is crucial to understand biological processes on a molecular level. Many more protein sequences are available than can be investigated experimentally. DeepGOPlus is a protein function prediction method based on deep learning and sequence similarity. DeepGOWeb makes the prediction model available through a website, an API, and through the SPARQL query language for interoperability with databases that rely on Semantic Web technologies. DeepGOWeb provides accurate and fast predictions and ensures that predicted functions are consistent with the Gene Ontology; it can provide predictions for any protein and any function in Gene Ontology. DeepGOWeb is freely available at https://deepgo.cbrc.kaust.edu.sa/.
PMID:34019664 | DOI:10.1093/nar/gkab373
Prototypes for automating product system model assembly
Int J Life Cycle Assess. 2021;26(3):483-496. doi: 10.1007/s11367-021-01870-9.
ABSTRACT
INTRODUCTION: The flexibility of life cycle inventory (LCI) background data selection is increasing with the increasing availability of data, but this comes along with the challenge of using the background data with primary life cycle inventory data. To relieve the burden on the practitioner to create the linkages and reduce bias, this study aimed at applying automation to create foreground LCI from primary data and link it to background data to construct product system models (PSM).
METHODS: Three experienced LCA software developers were commissioned to independently develop software prototypes to address the problem of how to generate an operable PSM from a complex product specification. The participants were given a confidential product specification in the form of a Bill of Materials (BOM) and were asked to develop and test, within a limited time period, prototype software that converted the BOM into a foreground model and linked it with one or more background datasets, along with a list of other functional requirements. The resulting prototypes were compared and tested with additional product specifications.
RESULTS: Each developer took a distinct approach to the problem. One approach used semantic similarity relations to identify best-fit background datasets. Another approach focused on producing a flexible description of the model structure that removed redundancy and permitted aggregation. Another approach provided an interactive web application for matching product components to standardized product classification systems to facilitate characterization and linking.
DISCUSSION: Four distinct steps were identified in the broader problem of automating PSM construction: creating a foreground model from product data, determining the quantitative properties of foreground model flows, linking flows to background datasets, and expressing the linked model in a format that could be used by existing LCA software. The three prototypes are complementary in that they address different steps and demonstrate alternative approaches. Manual work was still required in each case, especially in the descriptions of the product flows that must be provided by background datasets.
CONCLUSION: The study demonstrates the utility of a distributed, comparative software development, as applied to the problem of LCA software. The results demonstrate that the problem of automated PSM construction is tractable. The prototypes created advance the state of the art for LCA software.
PMID:34017158 | PMC:PMC8128697 | DOI:10.1007/s11367-021-01870-9
Automatic semantic segmentation of breast tumors in ultrasound images based on combining fuzzy logic and deep learning-A feasibility study
PLoS One. 2021 May 20;16(5):e0251899. doi: 10.1371/journal.pone.0251899. eCollection 2021.
ABSTRACT
Computer aided diagnosis (CAD) of biomedical images helps physicians characterize tissue quickly and easily. A scheme based on combining fuzzy logic (FL) and deep learning (DL) for automatic semantic segmentation (SS) of tumors in breast ultrasound (BUS) images is proposed. The proposed scheme consists of two steps: the first is FL-based preprocessing, and the second is convolutional neural network (CNN) based SS. Eight well-known CNN-based SS models were used in the study. The scheme was evaluated on a dataset of 400 cancerous BUS images and their corresponding 400 ground truth images. The SS process was applied in two modes: batch and one-by-one image processing. Three quantitative performance evaluation metrics were used: global accuracy (GA), mean Jaccard Index (mean intersection over union (IoU)), and mean BF (Boundary F1) Score. In batch processing mode, the average results over the eight CNN-based SS models on the 400 cancerous BUS images were: 95.45% GA compared with 86.08% without the fuzzy preprocessing step, 78.70% mean IoU compared with 49.61%, and 68.08% mean BF score compared with 42.63%. Moreover, the resulting segmented images showed tumor regions more accurately than CNN-based SS alone. In one-by-one image processing mode, however, there was no enhancement, either qualitatively or quantitatively. So, the proposed scheme may help enhance automatic SS of tumors in BUS images only when batch processing is needed. Otherwise, applying the proposed approach in one-by-one image mode will disrupt segmentation efficiency. The proposed batch processing scheme may be generalized for enhanced CNN-based SS of a targeted region of interest (ROI) in any batch of digital images. A modified small dataset is available: https://www.kaggle.com/mohammedtgadallah/mt-small-dataset (S1 Data).
PMID:34014987 | PMC:PMC8136850 | DOI:10.1371/journal.pone.0251899
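Illustrative sketch (not taken from the paper): the breast-ultrasound abstract above evaluates segmentation with global accuracy (GA) and mean intersection over union (IoU). The snippet below computes both on a pair of small label masks using the standard definitions. The toy masks and function names are hypothetical, and the Boundary F1 score is not shown.

```python
def global_accuracy(predicted, truth):
    """Fraction of pixels whose predicted label matches the ground truth."""
    pairs = [(p, t)
             for pred_row, true_row in zip(predicted, truth)
             for p, t in zip(pred_row, true_row)]
    return sum(p == t for p, t in pairs) / len(pairs)


def jaccard_index(predicted, truth, label):
    """Intersection over union for one class label, or None if the label
    appears in neither mask."""
    intersection = union = 0
    for pred_row, true_row in zip(predicted, truth):
        for p, t in zip(pred_row, true_row):
            in_pred, in_true = p == label, t == label
            intersection += in_pred and in_true
            union += in_pred or in_true
    return intersection / union if union else None


def mean_iou(predicted, truth, labels=(0, 1)):
    """Mean IoU over the class labels that occur in at least one mask."""
    scores = [jaccard_index(predicted, truth, label) for label in labels]
    scores = [s for s in scores if s is not None]
    return sum(scores) / len(scores)


# Hypothetical 2x4 masks: 1 = tumor pixel, 0 = background pixel.
truth_mask = [[0, 1, 1, 0],
              [0, 1, 1, 0]]
predicted_mask = [[0, 1, 1, 1],
                  [0, 0, 1, 0]]
ga = global_accuracy(predicted_mask, truth_mask)  # 0.75
miou = mean_iou(predicted_mask, truth_mask)       # 0.6
```

Six of the eight pixels match, giving a GA of 0.75, while each class has an intersection of 3 pixels and a union of 5, giving a mean IoU of 0.6. The gap between the two numbers shows why IoU is usually reported alongside accuracy.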