Results 1 - 8 of 8
1.
Sensors (Basel) ; 20(23), 2020 Nov 29.
Article in English | MEDLINE | ID: mdl-33260462

ABSTRACT

A preliminary analysis of Galileo F/NAV broadcast Clock and Ephemeris is performed in this paper with 43 months of data. Using consolidated Galileo Receiver Independent Exchange (RINEX) navigation files, automated navigation data monitoring is applied from 1 January 2017 to 31 July 2020 to detect and verify potential faults in the satellite broadcast navigation data. Based on these observation results, the Galileo Signal-in-Space is assessed, and the probability of satellite failure is estimated. The Galileo nominal ranging accuracy is also characterized. Results for GPS satellites are included in the paper to compare Galileo performances with a consolidated constellation. Although this study is limited by the short observation period available, the analysis over the last three-year window shows promising results with Psat = 3.2 × 10⁻⁶/sat, which is below the value of 1 × 10⁻⁵ stated by the Galileo commitments.

2.
BMC Bioinformatics ; 20(1): 216, 2019 Apr 29.
Article in English | MEDLINE | ID: mdl-31035936

ABSTRACT

BACKGROUND: The large biological databases such as GenBank contain vast numbers of records, the content of which is substantively based on external resources, including published literature. Manual curation is used to establish whether the literature and the records are indeed consistent. We explore in this paper an automated method for assessing the consistency of biological assertions, to assist biocurators, which we call BARC, Biocuration tool for Assessment of Relation Consistency. In this method a biological assertion is represented as a relation between two objects (for example, a gene and a disease); we then use our novel set-based relevance algorithm SaBRA to retrieve pertinent literature, and apply a classifier to estimate the likelihood that this relation (assertion) is correct. RESULTS: Our experiments on assessing gene-disease relations and protein-protein interactions using the PubMed Central collection show that BARC can be effective at assisting curators to perform data cleansing. Specifically, the results obtained showed that BARC substantially outperforms the best baselines, with an improvement of F-measure of 3.5% and 13%, respectively, on gene-disease relations and protein-protein interactions. We have additionally carried out a feature analysis that showed that all feature types are informative, as are all fields of the documents. CONCLUSIONS: BARC provides a clear benefit for the biocuration community, as there are no prior automated tools for identifying inconsistent assertions in large-scale biological databases.


Subjects
Algorithms , Data Mining/methods , Factual Databases , Nucleic Acid Databases , Humans , Protein Interaction Maps , Publishing
3.
Adv Sci (Weinh) ; 11(17): e2308652, 2024 May.
Article in English | MEDLINE | ID: mdl-38386329

ABSTRACT

Non-fullerene acceptors (NFAs) have recently emerged as pivotal materials for enhancing the efficiency of organic solar cells (OSCs). To further advance OSC efficiency, precise control over the energy levels of NFAs is imperative, necessitating the development of a robust computational method for accurate energy level predictions. Unfortunately, conventional computational techniques often yield relatively large errors, typically ranging from 0.2 to 0.5 electronvolts (eV), when predicting energy levels. In this study, the authors present a novel method that not only expedites energy level predictions but also significantly improves accuracy, reducing the error margin to 0.06 eV. The method comprises two essential components. The first component involves data cleansing, which systematically eliminates problematic experimental data and thereby minimizes input data errors. The second component introduces a molecular description method based on the electronic properties of the sub-units comprising NFAs. The approach simplifies the intricacies of molecular computation and demonstrates markedly enhanced prediction performance compared to the conventional density functional theory (DFT) method. Our methodology will expedite research in the field of NFAs, serving as a catalyst for the development of similar computational approaches to address challenges in other areas of material science and molecular research.

4.
MethodsX ; 9: 101850, 2022.
Article in English | MEDLINE | ID: mdl-36164434

ABSTRACT

As one of the law resources of Muslim society, hadith is very important to learn. Unlike most hadith-related research, which studies more about content, we examine the relationship pattern between hadith narrators. In the study of hadith science, a series of hadith narrators who narrate a hadith is referred to as a sanad. This hadith sanad must be connected to the Prophet as the primary source of a hadith. Therefore, research related to the relationship between narrators is fundamental because it affects the quality and validity of a hadith. This paper analyzes the pattern of hadith narrators using Sequential Pattern Discovery using Equivalence Classes (SPADE). We separate the data of the narrators from the content, whereas, in the hadith books we use, the two are still mixed. This study, therefore, provides detailed information on the steps in the analysis of the patterns of hadith narrators. Some of the highlights of this paper are:
• Algorithm 1 provides the detailed steps in data preprocessing to obtain the "clean data" needed in analyzing the pattern of narrator relationships.
• Algorithm 2 provides a detailed description of analyzing the pattern between hadith narrators using SPADE.

5.
Article in English | MEDLINE | ID: mdl-32771180

ABSTRACT

BACKGROUND: Mental health diagnostic approaches are seeking to identify biological markers to work alongside advanced machine learning approaches. It is difficult to identify a biological marker of disease when the traditional diagnostic labels themselves are not necessarily valid. METHODS: We worked with T1 structural magnetic resonance imaging data collected from 1493 individuals comprising healthy control subjects, patients with psychosis, and their unaffected first-degree relatives. Specifically, the dataset included 176 bipolar disorder probands, 134 schizoaffective disorder probands, 240 schizophrenia probands, 362 control subjects, and 581 patient relatives. We assumed that there might be noise in the diagnostic labeling process. We detected label noise by classifying the data multiple times using a support vector machine classifier, and then we flagged those individuals whom all classifiers unanimously mislabeled. Next, we assigned a new diagnostic label to these individuals, based on the biological data (magnetic resonance imaging), using an iterative data cleansing approach. RESULTS: Simulation results showed that our method was highly accurate in identifying label noise. Diagnostic and biotype categories showed about 65% and 63% noisy labels, respectively, with the largest amount of relabeling occurring between the healthy control subjects and individuals with bipolar disorder and schizophrenia as well as in unaffected close relatives. The extraction of imaging features highlighted regional brain changes associated with each group. CONCLUSIONS: This approach represents an initial step toward developing strategies that need not assume that existing mental health diagnostic categories are always valid but rather allows us to leverage this information while also acknowledging that there are misassignments.


Subjects
Bipolar Disorder , Psychotic Disorders , Schizophrenia , Bipolar Disorder/diagnostic imaging , Humans , Magnetic Resonance Imaging , Mental Health , Psychotic Disorders/diagnosis , Schizophrenia/diagnostic imaging
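
The flagging step described in the METHODS of record 5 (mark a subject only when every classifier run disagrees with the recorded label) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the predictions from the repeated support vector machine runs are already available, and the function name, label strings, and toy data are all hypothetical.

```python
from typing import Hashable, List, Sequence


def flag_unanimous_mislabels(
    labels: Sequence[Hashable],
    predictions: Sequence[Sequence[Hashable]],
) -> List[int]:
    """Return indices of samples that every classifier run mislabeled.

    labels: the recorded (possibly noisy) label of each sample.
    predictions: one sequence of predicted labels per classifier run,
        each aligned with `labels`. (Assumed to come from repeated
        classifier runs; how they are produced is outside this sketch.)
    """
    if not predictions:
        raise ValueError("need at least one classifier run")
    for run in predictions:
        if len(run) != len(labels):
            raise ValueError("each run must predict every sample")
    # A sample is flagged only if no run agrees with its recorded label.
    return [
        i
        for i, recorded in enumerate(labels)
        if all(run[i] != recorded for run in predictions)
    ]


# Hypothetical toy data: four subjects, three classifier runs.
labels = ["control", "bipolar", "schizophrenia", "control"]
runs = [
    ["control", "control", "schizophrenia", "bipolar"],
    ["control", "control", "bipolar", "bipolar"],
    ["control", "control", "schizophrenia", "bipolar"],
]
flagged = flag_unanimous_mislabels(labels, runs)
print(flagged)
```

Requiring unanimity is a conservative choice: subject 2 in the toy data is left alone because two of the three runs still agree with its recorded label.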
6.
J Med Imaging Radiat Oncol ; 63(4): 517-529, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31081603

ABSTRACT

INTRODUCTION: This paper provides a demonstration of how non-curated data can be retrospectively cleaned, so that existing repositories of radiotherapy treatment planning data can be used to complete bulk retrospective analyses of dosimetric trends and other plan characteristics. METHODS: A non-curated archive of 1137 radiotherapy treatment plans accumulated over a 12-month period, from five radiotherapy centres operated by one institution, was used to investigate and demonstrate a process of clinical data cleansing, by: identifying and translating inconsistent structure names; correcting inconsistent lung contouring; excluding plans for treatments other than breast tangents and plans without identifiable PTV, lung and heart structures; and identifying but not excluding plans that deviated from the local planning protocol. PTV, heart and lung dose-volume metrics were evaluated, in addition to a sample of personnel and linac load indicators. RESULTS: Data cleansing reduced the number of treatment plans in the sample by 35.7%. Inconsistent structure names were successfully identified and translated (e.g. 35 different names for lung). Automatically separating whole lung structures into left and right lung structures allowed the effect of contralateral and ipsilateral lung dose to be evaluated, while introducing some small uncertainties, compared to manual contouring. PTV doses were indicative of prescription doses. Breast treatment work was unevenly distributed between oncologists and between metropolitan and regional centres. CONCLUSION: This paper exemplifies the data cleansing and data analysis steps that may be completed using existing treatment planning data, to provide individual radiation oncology departments with access to information on their own patient populations. Clearly, the well-planned and systematic recording of new, high quality data is the preferred solution, but the retrospective curation of non-curated data may be a useful interim solution, for radiation oncology departments where the systems for recording of new data have yet to be designed and agreed.


Subjects
Breast Neoplasms/radiotherapy , Computer-Assisted Radiotherapy Planning/methods , Breast/diagnostic imaging , Breast Neoplasms/diagnostic imaging , Female , Humans , Lung/diagnostic imaging , Organs at Risk/diagnostic imaging , Radiotherapy Dosage , Retrospective Studies , X-Ray Computed Tomography/methods
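
One way to picture the "identifying and translating inconsistent structure names" step in record 6 is a lookup from normalised free-text names to a canonical label. The sketch below is only an illustration under stated assumptions: the alias table, the canonical labels (`Lung_L`, `Lung_R`, and so on), and the function names are hypothetical and are not taken from the paper.

```python
import re
from typing import Dict, Optional

# Hypothetical alias table. A real table would be built by reviewing the
# names that actually occur in the archive being cleaned.
CANONICAL_ALIASES: Dict[str, str] = {
    "lung": "Lung",
    "lungs": "Lung",
    "both lungs": "Lung",
    "lt lung": "Lung_L",
    "left lung": "Lung_L",
    "lung l": "Lung_L",
    "rt lung": "Lung_R",
    "right lung": "Lung_R",
    "lung r": "Lung_R",
    "heart": "Heart",
    "ptv": "PTV",
}


def normalise(name: str) -> str:
    """Lower-case the name and collapse separators into single spaces."""
    return re.sub(r"[\s_\-.]+", " ", name.strip().lower()).strip()


def translate_structure_name(name: str) -> Optional[str]:
    """Return the canonical label, or None if the name is not recognised."""
    return CANONICAL_ALIASES.get(normalise(name))


for raw in ["LT_LUNG", "Right-Lung ", "  HEART", "Lung.L", "Couch"]:
    print(repr(raw), "->", translate_structure_name(raw))
```

Returning `None` for unrecognised names, rather than guessing, keeps unmatched structures visible so that they can be reviewed by hand or the plan excluded.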
7.
Stud Health Technol Inform ; 248: 116-123, 2018.
Article in English | MEDLINE | ID: mdl-29726427

ABSTRACT

BACKGROUND: A challenge of using electronic health records for secondary analyses is data quality. Body mass index (BMI) is an important predictor for various diseases but is often not documented properly. OBJECTIVES: The aim of our study is to perform data cleansing on BMI values and to find the best method for an imputation of missing values in order to increase data quality. Further, we want to assess the effect of changes in data quality on the performance of a prediction model based on machine learning. METHODS: After data cleansing on BMI data, we compared machine learning methods and statistical methods in their accuracy of imputed values using the root mean square error. In a second step, we used three variations of BMI data as a training set for a model predicting the occurrence of delirium. RESULTS: Neural network and linear regression models performed best for imputation. There were no changes in model performance for different BMI input data. CONCLUSION: Although data quality issues may lead to biases, they do not always affect the performance of secondary analyses.


Subjects
Body Mass Index , Machine Learning , Humans , Linear Models , Computer Neural Networks
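
The comparison in record 7 scores imputation methods by root mean square error (RMSE) against known values. A minimal sketch of that scoring is shown below. The BMI numbers are invented for illustration, and mean and median imputation stand in for the neural network and regression models of the study, which are not reproduced here.

```python
import math
from statistics import mean, median
from typing import Sequence


def rmse(true_values: Sequence[float], imputed_values: Sequence[float]) -> float:
    """Root mean square error between known values and their imputations."""
    if not true_values or len(true_values) != len(imputed_values):
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(
        mean((t - p) ** 2 for t, p in zip(true_values, imputed_values))
    )


# Hypothetical toy data: observed BMI values, plus values that were
# deliberately hidden so the imputations can be scored against them.
observed_bmi = [22.0, 24.0, 26.0, 28.0, 40.0]
hidden_bmi = [23.0, 27.0]

# Two simple stand-in imputers: fill every gap with the mean or the median.
mean_fill = [mean(observed_bmi)] * len(hidden_bmi)
median_fill = [median(observed_bmi)] * len(hidden_bmi)

rmse_mean = rmse(hidden_bmi, mean_fill)
rmse_median = rmse(hidden_bmi, median_fill)
print(f"mean imputation RMSE:   {rmse_mean:.3f}")
print(f"median imputation RMSE: {rmse_median:.3f}")
```

In this toy example the single high value (40.0) pulls the mean upward, so median imputation scores a lower RMSE. This illustrates the metric only and says nothing about the study's results.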
8.
BioData Min ; 7: 3, 2014.
Article in English | MEDLINE | ID: mdl-24872843

ABSTRACT

BACKGROUND: In silico biology is increasingly important and is often based on public data. While the problem of contamination is well recognised in microbiology labs, the corresponding problem of database corruption has received less attention. RESULTS: Mapping 50 billion next generation DNA sequences from The Thousand Genome Project against published genomes reveals many that match one or more Mycoplasma but are not included in the reference human genome GRCh37.p5. Many of these are of low quality but NCBI BLAST searches confirm some high quality, high entropy sequences match Mycoplasma but no human sequences. CONCLUSIONS: It appears at least 7% of 1000G samples are contaminated.
