Results 1-20 of 162
1.
J Virol ; 98(3): e0173123, 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38329345

ABSTRACT

In our 2012 genome announcement (J Virol 86:11403-11404, 2012, https://doi.org/10.1128/JVI.01954-12), we initially identified the host bacterium of bacteriophage Enc34 as Enterobacter cancerogenus using biochemical tests. However, later in-house DNA sequencing revealed that the true host is a strain of Hafnia alvei. Capitalizing on our new DNA-sequencing capabilities, we also refined the genomic termini of Enc34, confirming a 60,496-bp genome with 12-nucleotide 5' cohesive ends. IMPORTANCE: Our correction reflects the evolving landscape of bacterial identification, where molecular methods have supplanted traditional biochemical tests. This case underscores the significance of revisiting past identifications, as seemingly known bacterial strains may yield unexpected discoveries, necessitating essential updates to the scientific record. Despite the host identity correction, our genome announcement retains importance as the first complete genome sequence of a Hafnia alvei bacteriophage.


Subject(s)
Bacteriophages, Hafnia alvei, Host Tropism, Bacteriophages/classification, Bacteriophages/genetics, Bacteriophages/isolation & purification, Bacteriophages/physiology, Enterobacter/chemistry, Enterobacter/virology, Genome, Viral/genetics, Hafnia alvei/classification, Hafnia alvei/genetics, Hafnia alvei/virology, Scientific Experimental Error, Sequence Analysis, DNA
4.
J Virol ; 97(4): e0036523, 2023 04 27.
Article in English | MEDLINE | ID: mdl-36897089

ABSTRACT

When humans experience a new, devastating viral infection such as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), significant challenges arise. How should individuals as well as societies respond to the situation? One of the primary questions concerns the origin of the SARS-CoV-2 virus that infected and was transmitted efficiently among humans, resulting in a pandemic. At first glance, the question appears straightforward to answer. However, the origin of SARS-CoV-2 has been the topic of substantial debate primarily because we do not have access to some relevant data. At least two major hypotheses have been suggested: a natural origin through zoonosis followed by sustained human-to-human spread or the introduction of a natural virus into humans from a laboratory source. Here, we summarize the scientific evidence that informs this debate to provide our fellow scientists and the public with the tools to join the discussion in a constructive and informed manner. Our goal is to dissect the evidence to make it more accessible to those interested in this important problem. The engagement of a broad representation of scientists is critical to ensure that the public and policy-makers can draw on relevant expertise in navigating this controversy.


Subject(s)
COVID-19, Pandemics, SARS-CoV-2, Humans, COVID-19/epidemiology, COVID-19/transmission, COVID-19/virology, Laboratories/standards, Research/standards, SARS-CoV-2/classification, SARS-CoV-2/genetics, SARS-CoV-2/physiology, Scientific Experimental Error, Viral Zoonoses/transmission, Viral Zoonoses/virology, Chiroptera/virology, Animals, Wild/virology
7.
Nucleic Acids Res ; 49(2): e7, 2021 01 25.
Article in English | MEDLINE | ID: mdl-32710622

ABSTRACT

Traditional epitranscriptomics relies on capturing a single RNA modification by antibody or chemical treatment, combined with short-read sequencing to identify its transcriptomic location. This approach is labor-intensive and may introduce experimental artifacts. Direct sequencing of native RNA with Oxford Nanopore Technologies (ONT) allows RNA base modifications to be detected directly, although these modifications may appear as sequencing errors. The percent Error of Specific Bases (%ESB) was higher for native RNA than for unmodified RNA, which enabled the detection of ribonucleotide modification sites. Based on these %ESB differences, we developed a bioinformatic tool, epitranscriptional landscape inferring from glitches of ONT signals (ELIGOS), which is based on various types of synthetic modified RNA and was applied to rRNA and mRNA. ELIGOS is able to accurately predict known classes of RNA methylation sites (AUC > 0.93) in rRNAs from Escherichia coli, yeast, and human cells, using as the reference either unmodified in vitro-transcribed RNA or a background error model that mimics the systematic error of direct RNA sequencing. Using differential analysis with ELIGOS to study the impact of the RNA m6A methyltransferase by comparing wild-type and knockout yeast and mouse cells, the well-known DRACH/RRACH motif was localized and identified, consistent with previous studies. Lastly, the DRACH motif was also identified in the mRNA of three human cell lines. The mRNA modifications identified by ELIGOS are resolved at the level of individual bases. In summary, we have developed a bioinformatic software package to uncover native RNA modifications.
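The %ESB comparison described above can be illustrated with a minimal sketch (a toy model with illustrative data and thresholds, not ELIGOS's actual interface): at each reference position, compute the fraction of base calls that disagree with the reference, then flag positions where native RNA errs much more often than unmodified (IVT) RNA.

```python
def esb(pileup_bases, ref_base):
    """Percent of base calls at one position that differ from the reference."""
    if not pileup_bases:
        return 0.0
    errors = sum(1 for b in pileup_bases if b != ref_base)
    return 100.0 * errors / len(pileup_bases)

def candidate_sites(native, ivt, reference, min_diff=10.0):
    """Positions where native %ESB exceeds IVT %ESB by at least min_diff."""
    sites = []
    for i, ref_base in enumerate(reference):
        diff = esb(native[i], ref_base) - esb(ivt[i], ref_base)
        if diff >= min_diff:
            sites.append((i, round(diff, 1)))
    return sites

# Toy pileups over a 5-nt reference; the modified adenosine at position 2
# miscalls in native reads but not in the IVT control.
reference = "GGACU"
native = [list("GGGGG"), list("GGGGG"), list("AAAGA"), list("CCCCC"), list("UUUUU")]
ivt    = [list("GGGGG"), list("GGGGG"), list("AAAAA"), list("CCCCC"), list("UUUUU")]
print(candidate_sites(native, ivt, reference))
```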


Subject(s)
Computational Biology/methods, High-Throughput Nucleotide Sequencing, RNA Processing, Post-Transcriptional, RNA-Seq, Scientific Experimental Error, Software, Adenine/analogs & derivatives, Adenine/analysis, Animals, Cell Line, Escherichia coli/genetics, Humans, Meiosis, Methyltransferases/deficiency, Methyltransferases/metabolism, Mice, Mice, Knockout, Nucleotide Motifs, RNA, Bacterial/genetics, RNA, Fungal/genetics, RNA, Messenger/genetics, RNA, Ribosomal/genetics, ROC Curve, Saccharomyces cerevisiae/genetics, Sequence Analysis, DNA, Templates, Genetic, Transcription, Genetic
9.
Am J Epidemiol ; 190(9): 1830-1840, 2021 09 01.
Article in English | MEDLINE | ID: mdl-33517416

ABSTRACT

Although variables are often measured with error, the impact of measurement error on machine-learning predictions is seldom quantified. The purpose of this study was to assess the impact of measurement error on the performance of random-forest models and variable importance. First, we assessed the impact of misclassification (i.e., measurement error of categorical variables) of predictors on random-forest model performance (e.g., accuracy, sensitivity) and variable importance (mean decrease in accuracy) using data from the National Comorbidity Survey Replication (2001-2003). Second, we created simulated data sets in which we knew the true model performance and variable importance measures and could verify that quantitative bias analysis was recovering the truth in misclassified versions of the data sets. Our findings showed that measurement error in the data used to construct random forests can distort model performance and variable importance measures and that bias analysis can recover the correct results. This study highlights the utility of applying quantitative bias analysis in machine learning to quantify the impact of measurement error on study results.
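The quantitative bias analysis the study applies to misclassified categorical predictors rests on a standard back-calculation: given an assumed sensitivity and specificity of the classification, the observed count of positives can be solved for the true count. A minimal sketch with illustrative numbers:

```python
def corrected_count(n_observed_positive, n_total, sensitivity, specificity):
    """Back-calculate the true number of positives from a misclassified count.

    Solves: observed = Se * true + (1 - Sp) * (total - true)
    """
    return (n_observed_positive - (1 - specificity) * n_total) / (
        sensitivity + specificity - 1
    )

# 300 true positives out of 1000, classified with Se=0.90, Sp=0.95, yield
# an expected observed count of 0.9*300 + 0.05*700 = 305; the correction
# recovers the true 300.
print(corrected_count(305, 1000, 0.90, 0.95))
```

Feeding such corrected data (or probabilistically resampled versions of it) back into the learner is what lets the bias analysis recover the error-free model performance.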


Subject(s)
Bias, Scientific Experimental Error/statistics & numerical data, Computer Simulation, Datasets as Topic, Humans, Machine Learning/statistics & numerical data, Probability, Suicide, Attempted/statistics & numerical data
10.
PLoS Biol ; 16(12): e2006776, 2018 12.
Article in English | MEDLINE | ID: mdl-30571676

ABSTRACT

Several organisms, including humans, display a deceleration in mortality rates at advanced ages. This mortality deceleration is sufficiently rapid to allow late-life mortality to plateau in old age in several species, causing the apparent cessation of biological ageing. Here, it is shown that late-life mortality deceleration (LLMD) and late-life plateaus are caused by common demographic errors. Age estimation and cohort blending errors introduced at rates below 1 in 10,000 are sufficient to cause LLMD and plateaus. In humans, observed error rates of birth and death registration predict the magnitude of LLMD. Correction for these sources of demographic error using a mixed linear model eliminates LLMD and late-life mortality plateaus (LLMPs) without recourse to biological or evolutionary models. These results suggest models developed to explain LLMD have been fitted to an error distribution, that ageing does not slow or stop during old age in humans, and that there is a finite limit to human longevity.
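The mechanism proposed above, rare demographic errors producing an apparent hazard plateau, can be reproduced deterministically: mix a Gompertz cohort with a tiny subpopulation whose recorded ages overstate their true ages. The parameter values below are illustrative, not the paper's fitted estimates.

```python
import math

A, B = 1e-5, 0.11          # Gompertz hazard h(t) = A * exp(B * t)
P_ERROR = 1e-4             # 1 in 10,000 records overstate age by 10 years
AGE_ERROR = 10.0

def hazard(t):
    return A * math.exp(B * t)

def survival(t):
    return math.exp(-(A / B) * (math.exp(B * t) - 1.0))

def observed_hazard(t):
    """Hazard of the mixed cohort at *recorded* age t.

    Misrecorded individuals with recorded age t are truly aged t - 10, so
    they both die less often and remain in the cohort far longer, coming to
    dominate the oldest recorded ages.
    """
    w_ok = (1 - P_ERROR) * survival(t)
    w_err = P_ERROR * survival(t - AGE_ERROR)
    return (w_ok * hazard(t) + w_err * hazard(t - AGE_ERROR)) / (w_ok + w_err)

# Ratio of observed to true hazard: ~1 at age 80, well below 1 at age 110,
# i.e. apparent late-life mortality deceleration from errors alone.
for age in (80, 95, 110):
    print(age, round(observed_hazard(age) / hazard(age), 3))
```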


Subject(s)
Aging/physiology, Demography/methods, Mortality/trends, Animals, Biological Evolution, Humans, Linear Models, Longevity/physiology, Scientific Experimental Error
11.
Pharmacol Res ; 163: 105229, 2021 01.
Article in English | MEDLINE | ID: mdl-33031909

ABSTRACT

OBJECTIVES: Because observational studies often use imperfect measurements, results are prone to misclassification errors. As a motivating example, we used the possible teratogenic risks of antiemetic agents in pregnancy, since a large observational study recently showed that first-trimester exposure to doxylamine-pyridoxine was associated with a significantly increased risk of congenital malformations as a whole, as well as central nervous system defects, whereas previous observational studies did not show such associations. A meta-analysis on this issue was carried out with the aim of illustrating how differential exposure and outcome misclassification may lead to uncertain conclusions. METHODS: Medline, searched to October 2019 for full-text papers in English. Summary odds ratios (ORs) with confidence intervals (CIs) were calculated using random-effect models. Probabilistic sensitivity analyses were performed to evaluate the extent of differential misclassification required to account for the exposure-outcome association. RESULTS: Summary ORs were 1.02 (95% CI, 0.92-1.15), 0.99 (0.82-1.19), and 1.25 (1.08-1.44) for overall congenital, cardiocirculatory, and central nervous system malformations, respectively. By assuming exposure and outcome bias factors of 0.95 (i.e., newborns with congenital defects had exposure specificity 5% lower than healthy newborns) and 1.12 (i.e., exposed newborns had outcome sensitivity 12% higher than unexposed newborns), respectively, the summary OR for central nervous system defects became 1.13 (95% CI, 0.99-1.29) and 1.17 (95% CI, 0.99-1.38). CONCLUSION: Observational investigations and meta-analyses of observational studies need cautious interpretation. Their susceptibility to several, often subtle, sources of bias should be carefully evaluated.
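The bias-factor adjustment used in such probabilistic sensitivity analyses can be sketched in its simplest multiplicative form: an observed odds ratio is divided by an assumed bias factor to approximate the association net of differential misclassification. The counts below are made up, not the meta-analysis's data.

```python
def odds_ratio(a, b, c, d):
    """OR from a 2x2 table: exposed cases a, unexposed cases b,
    exposed controls c, unexposed controls d."""
    return (a * d) / (b * c)

def bias_adjusted(or_observed, bias_factor):
    """Divide out a multiplicative bias factor (>1 biases away from the null)."""
    return or_observed / bias_factor

# Observed OR ~1.28 from a toy 2x2 table; assuming a bias factor of 1.12
# (outcome sensitivity 12% higher among exposed) pulls it toward the null.
or_obs = odds_ratio(50, 450, 40, 460)
print(round(bias_adjusted(or_obs, 1.12), 2))
```

In a full probabilistic analysis the bias parameters are drawn from distributions and the adjustment repeated many times to produce an interval rather than a point estimate.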


Subject(s)
Abnormalities, Drug-Induced/epidemiology, Antiemetics/adverse effects, Dicyclomine/adverse effects, Doxylamine/adverse effects, Nausea/drug therapy, Pyridoxine/adverse effects, Vomiting/drug therapy, Drug Combinations, Female, Humans, Nausea/epidemiology, Observational Studies as Topic, Odds Ratio, Pregnancy, Scientific Experimental Error, Uncertainty, Vomiting/epidemiology
12.
EMBO Rep ; 20(12): e49482, 2019 12 05.
Article in English | MEDLINE | ID: mdl-31680386

ABSTRACT

Old data are like yesterday's leftovers: sapped of novelty and excitement. But revisiting old sequence data with a fresh mind and new techniques can yield new and unexpected results.


Subject(s)
Genome, Plant, Genomics, Arabidopsis/genetics, Chlamydomonas reinhardtii/genetics, DNA, Mitochondrial/genetics, DNA, Plant/genetics, Genome, Plastid, Genomics/standards, Organelles/genetics, Publishing/standards, Scientific Experimental Error
13.
Methods ; 174: 27-41, 2020 03 01.
Article in English | MEDLINE | ID: mdl-31344404

ABSTRACT

Super-resolution fluorescence microscopy has become an important catalyst for discovery in the life sciences. In STimulated Emission Depletion (STED) microscopy, a pattern of light drives fluorophores from a signal-emitting on-state to a non-signalling off-state. Only emitters residing in a sub-diffraction volume around an intensity minimum are allowed to fluoresce, rendering them distinguishable from the nearby but dark fluorophores. STED routinely achieves resolution in the range of a few tens of nanometers in biological samples and is suitable for live imaging. Here, we review the working principle of STED and provide general guidelines for successful STED imaging. The drive for ever-higher resolution comes at the cost of an increased light burden. We discuss techniques to reduce light exposure and mitigate its detrimental effects on the specimen. These include specialized illumination strategies as well as protecting fluorophores from photobleaching mediated by high-intensity STED light. This opens up the prospect of volumetric imaging in living cells and tissues with diffraction-unlimited resolution in all three spatial dimensions.
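The resolution scaling behind STED is commonly summarized by a modified diffraction formula, d ≈ λ / (2·NA·√(1 + I/Is)), where I is the peak STED intensity and Is the fluorophore's saturation intensity. The values below are typical textbook numbers, not figures from this review.

```python
import math

def sted_resolution_nm(wavelength_nm, na, i_over_is):
    """Approximate STED resolution; i_over_is = 0 recovers the Abbe limit."""
    return wavelength_nm / (2.0 * na * math.sqrt(1.0 + i_over_is))

# Confocal diffraction limit (no STED light) vs. STED at I/Is = 100:
print(round(sted_resolution_nm(640, 1.4, 0), 1))    # ~230 nm
print(round(sted_resolution_nm(640, 1.4, 100), 1))  # a few tens of nm
```

The square-root dependence is why the "strive for ever higher resolution" costs so much light: halving d again requires roughly quadrupling the STED intensity.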


Subject(s)
Image Processing, Computer-Assisted/methods, Microscopy, Fluorescence/methods, Color, Equipment Reuse, Fluorescence, Fluorescent Dyes/chemistry, Fluorescent Dyes/radiation effects, Lighting/methods, Optical Imaging/methods, Photobleaching, Scientific Experimental Error, Time Factors
14.
Environ Health ; 20(1): 94, 2021 08 24.
Article in English | MEDLINE | ID: mdl-34429109

ABSTRACT

BACKGROUND: Most epidemiological studies estimate associations without considering exposure measurement error. While some studies have estimated the impact of error in single-exposure models, we aimed to quantify the effect of measurement error in multi-exposure models, specifically in time-series analysis of PM2.5, NO2, and mortality, using simulations under various plausible scenarios for exposure errors. Measurement error in multi-exposure models can lead to effect transfer, where part of the effect estimate is transferred from the pollutant measured with more error to the one measured with less error. This complicates interpretation of the independent effects of different pollutants, and thus of the relative importance of reducing their concentrations in air pollution policy. METHODS: Measurement error was defined as the difference between ambient concentrations and personal exposure from outdoor sources. Simulation inputs for error magnitude and variability were informed by the literature. Error-free exposures with their consequent health outcome, and error-prone exposures of various error types (classical/Berkson), were generated. Bias was quantified as the relative difference in effect estimates between the error-free and error-prone exposures. RESULTS: Mortality effect estimates were generally underestimated, with greater bias observed when low ratios of the true exposure variance over the error variance were assumed (27.4% underestimation for NO2). Higher ratios resulted in smaller, but still substantial, bias (up to 19% for both pollutants). Effect transfer was observed, indicating that less precise measurements for one pollutant (NO2) yield more bias, while the co-pollutant (PM2.5) associations were found to be closer to the true values. Interestingly, the sum of single-pollutant model effect estimates was closer to the summed true associations than that from multi-pollutant models, owing to cancelling out of confounding and measurement error bias.
CONCLUSIONS: Our simulation study indicated an underestimation of true independent health effects of multiple exposures due to measurement error. Using error parameter information in future epidemiological studies should provide more accurate concentration-response functions.
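The attenuation these simulations quantify follows, in the single-exposure case, a classical result: under classical measurement error the expected slope shrinks by the reliability ratio var(X)/(var(X)+var(U)). A minimal illustration with made-up variances (in multi-exposure models part of the lost effect additionally transfers to the better-measured co-pollutant):

```python
def attenuation_factor(var_true, var_error):
    """Expected multiplicative bias on a regression slope under classical
    measurement error: var(X) / (var(X) + var(U))."""
    return var_true / (var_true + var_error)

beta_true = 1.0
# A "low ratio" scenario: true-exposure variance only twice the error
# variance, so a third of the association is lost to attenuation.
print(beta_true * attenuation_factor(2.0, 1.0))
```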


Subject(s)
Air Pollution/adverse effects, Environmental Exposure/adverse effects, Models, Theoretical, Mortality, Scientific Experimental Error, Air Pollutants/adverse effects, Air Pollutants/analysis, Air Pollution/analysis, Bias, Computer Simulation, Environmental Exposure/analysis, Humans, Nitrogen Dioxide/adverse effects, Nitrogen Dioxide/analysis, Particulate Matter/adverse effects, Particulate Matter/analysis
15.
Nucleic Acids Res ; 47(21): 10994-11006, 2019 12 02.
Article in English | MEDLINE | ID: mdl-31584084

ABSTRACT

The widespread occurrence of repetitive stretches of DNA in genomes of organisms across the tree of life imposes fundamental challenges for sequencing, genome assembly, and automated annotation of genes and proteins. This multi-level problem can lead to errors in genome and protein databases that are often not recognized or acknowledged. As a consequence, end users working with sequences with repetitive regions are faced with 'ready-to-use' deposited data whose trustworthiness is difficult to determine, let alone to quantify. Here, we provide a review of the problems associated with tandem repeat sequences that originate from different stages during the sequencing-assembly-annotation-deposition workflow, and that may proliferate in public database repositories affecting all downstream analyses. As a case study, we provide examples of the Atlantic cod genome, whose sequencing and assembly were hindered by a particularly high prevalence of tandem repeats. We complement this case study with examples from other species, where mis-annotations and sequencing errors have propagated into protein databases. With this review, we aim to raise the awareness level within the community of database users, and alert scientists working in the underlying workflow of database creation that the data they omit or improperly assemble may well contain important biological information valuable to others.
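Tandem repeats of the kind that complicated the Atlantic cod assembly can be located in a sequence with a simple back-to-back scan; the sketch below is a toy illustration, not an assembler's repeat model, and the sequence is invented.

```python
def find_tandem_repeats(seq, k, min_copies=3):
    """Return (start, unit, copies) for runs of a k-mer unit repeated
    back to back at least min_copies times."""
    hits = []
    i = 0
    while i + k <= len(seq):
        unit = seq[i:i + k]
        copies = 1
        # Extend while the next k bases equal the unit.
        while seq[i + copies * k:i + (copies + 1) * k] == unit:
            copies += 1
        if copies >= min_copies:
            hits.append((i, unit, copies))
            i += copies * k  # skip past the whole run
        else:
            i += 1
    return hits

# A dinucleotide (AC) repeated five times inside flanking sequence:
print(find_tandem_repeats("GGAACACACACACTT", 2))
```

Short reads that start and end inside such a run cannot be placed unambiguously, which is precisely how repeat-induced errors enter assemblies and then propagate to protein databases.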


Subject(s)
DNA/genetics, Databases, Nucleic Acid, Databases, Protein, Scientific Experimental Error, Tandem Repeat Sequences/genetics, Animals, Gadus morhua/genetics, Sequence Analysis, DNA
16.
Nucleic Acids Res ; 47(15): e87, 2019 09 05.
Article in English | MEDLINE | ID: mdl-31127310

ABSTRACT

Detection of cancer-associated somatic mutations has broad applications for oncology and precision medicine. However, this becomes challenging when cancer-derived DNA is in low abundance, such as in impure tissue specimens or in circulating cell-free DNA. Next-generation sequencing (NGS) is particularly prone to technical artefacts that can limit the accuracy for calling low-allele-frequency mutations. State-of-the-art methods to improve detection of low-frequency mutations often employ unique molecular identifiers (UMIs) for error suppression; however, these methods are highly inefficient as they depend on redundant sequencing to assemble consensus sequences. Here, we present a novel strategy to enhance the efficiency of UMI-based error suppression by retaining single reads (singletons) that can participate in consensus assembly. This 'Singleton Correction' methodology outperformed other UMI-based strategies in efficiency, leading to greater sensitivity with high specificity in a cell line dilution series. Significant benefits were seen with Singleton Correction at sequencing depths ≤16 000×. We validated the utility and generalizability of this approach in a cohort of >300 individuals whose peripheral blood DNA was subjected to hybrid capture sequencing at ∼5000× depth. Singleton Correction can be incorporated into existing UMI-based error suppression workflows to boost mutation detection accuracy, thus improving the cost-effectiveness and clinical impact of NGS.
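The UMI-based error suppression that Singleton Correction builds on can be sketched as per-position majority voting within UMI families. The `keep_singletons` flag below merely illustrates the idea of retaining single-read families; the paper's actual correction logic is more involved than shown.

```python
from collections import Counter, defaultdict

def consensus(reads):
    """Majority base at each position across same-length reads."""
    return "".join(
        Counter(bases).most_common(1)[0][0] for bases in zip(*reads)
    )

def collapse_by_umi(tagged_reads, keep_singletons=True):
    """tagged_reads: iterable of (umi, read). Returns {umi: consensus_read}.

    Reads sharing a UMI are collapsed to a consensus, suppressing independent
    sequencing errors. Families of size one have no partner to vote with;
    conventional pipelines discard them, which wastes sequencing depth.
    """
    families = defaultdict(list)
    for umi, read in tagged_reads:
        families[umi].append(read)
    out = {}
    for umi, reads in families.items():
        if len(reads) == 1 and not keep_singletons:
            continue
        out[umi] = consensus(reads)
    return out

# One 3-read family (with a single-read error voted away) and one singleton:
reads = [("AAT", "ACGT"), ("AAT", "ACGA"), ("AAT", "ACGT"), ("CCG", "TTGA")]
print(collapse_by_umi(reads))
```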


Subject(s)
DNA Barcoding, Taxonomic/methods, Leukemia, Myeloid, Acute/genetics, Mutation, Neoplasm Proteins/genetics, Sequence Analysis, DNA/methods, Alleles, Cell Line, Tumor, Fetal Blood/cytology, Fetal Blood/metabolism, Gene Frequency, HCT116 Cells, High-Throughput Nucleotide Sequencing, Humans, Leukemia, Myeloid, Acute/pathology, Leukocytes, Mononuclear/metabolism, Leukocytes, Mononuclear/pathology, Precision Medicine/methods, Scientific Experimental Error
17.
Proc Natl Acad Sci U S A ; 115(11): 2563-2570, 2018 03 13.
Article in English | MEDLINE | ID: mdl-29531079

ABSTRACT

Some aspects of science, taken at the broadest level, are universal in empirical research. These include collecting, analyzing, and reporting data. In each of these aspects, errors can and do occur. In this work, we first discuss the importance of focusing on statistical and data errors to continually improve the practice of science. We then describe underlying themes of the types of errors and postulate contributing factors. To do so, we describe a case series of relatively severe data and statistical errors coupled with surveys of some types of errors to better characterize the magnitude, frequency, and trends. Having examined these errors, we then discuss the consequences of specific errors or classes of errors. Finally, given the extracted themes, we discuss methodological, cultural, and system-level approaches to reducing the frequency of commonly observed errors. These approaches will plausibly contribute to the self-critical, self-correcting, ever-evolving practice of science, and ultimately to furthering knowledge.


Subject(s)
Data Collection, Research Design, Scientific Experimental Error, Statistics as Topic/standards, Data Collection/standards, Data Collection/statistics & numerical data, Humans, Quality Control, Reproducibility of Results, Research Design/standards, Research Design/statistics & numerical data, Science/standards, Science/statistics & numerical data
18.
J Chem Inf Model ; 60(4): 1969-1982, 2020 04 27.
Article in English | MEDLINE | ID: mdl-32207612

ABSTRACT

Given a particular descriptor/method combination, some quantitative structure-activity relationship (QSAR) datasets are very predictive by random-split cross-validation while others are not. Recent literature in modelability suggests that the limiting issue for predictivity is in the data, not the QSAR methodology, and the limits are due to activity cliffs. Here, we investigate, on in-house data, the relative usefulness of experimental error, distribution of the activities, and activity cliff metrics in determining how predictive a dataset is likely to be. We include unmodified in-house datasets, datasets that should be perfectly predictive based only on the chemical structure, datasets where the distribution of activities is manipulated, and datasets that include a known amount of added noise. We find that activity cliff metrics determine predictivity better than the other metrics we investigated, whatever the type of dataset, consistent with the modelability literature. However, such metrics cannot distinguish real activity cliffs due to large uncertainties in the activities. We also show that a number of modern QSAR methods, and some alternative descriptors, are equally bad at predicting the activities of compounds on activity cliffs, consistent with the assumptions behind "modelability." Finally, we relate time-split predictivity with random-split predictivity and show that different coverages of chemical space are at least as important as uncertainty in activity and/or activity cliffs in limiting predictivity.
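One widely used activity-cliff metric (not necessarily the paper's in-house choice) is the structure-activity landscape index, SALI = |ΔA| / (1 − similarity): a large activity gap between two very similar compounds gives a large SALI. The sketch below uses a Tanimoto coefficient on toy feature sets where a real workflow would use chemical fingerprints (e.g. via RDKit); compounds and activities are invented.

```python
def tanimoto(features_a, features_b):
    """Tanimoto coefficient between two feature sets."""
    a, b = set(features_a), set(features_b)
    return len(a & b) / len(a | b)

def sali(activity_a, activity_b, features_a, features_b):
    """Structure-activity landscape index for one compound pair."""
    sim = tanimoto(features_a, features_b)
    if sim == 1.0:
        return float("inf")  # identical descriptors: any activity gap is a cliff
    return abs(activity_a - activity_b) / (1.0 - sim)

# Two near-identical compounds (Cl -> F swap) with a 2-log potency gap:
print(sali(7.5, 5.5, {"ring", "amide", "Cl"}, {"ring", "amide", "F"}))
```

Note the metric's blind spot the abstract points to: if the 2-log gap is really assay noise, SALI still reports a steep cliff, since it cannot distinguish real cliffs from large uncertainties in the activities.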


Subject(s)
Quantitative Structure-Activity Relationship, Scientific Experimental Error, Structure-Activity Relationship, Uncertainty
19.
Clin Chem Lab Med ; 58(7): 1070-1076, 2020 06 25.
Article in English | MEDLINE | ID: mdl-32172228

ABSTRACT

A novel zoonotic coronavirus outbreak is spreading all over the world. This pandemic disease has now been defined as novel coronavirus disease 2019 (COVID-19), and is sustained by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). As the current gold standard for the etiological diagnosis of SARS-CoV-2 infection is (real-time) reverse transcription polymerase chain reaction (rRT-PCR) on respiratory tract specimens, the diagnostic accuracy of this technique must be considered a foremost prerequisite. Overall, potential RT-PCR vulnerabilities include general preanalytical issues such as identification problems; inadequate procedures for collection, handling, transport, and storage of the swabs; collection of inappropriate or inadequate material (in quality or volume); presence of interfering substances; and manual errors, as well as specific aspects such as sample contamination and testing patients receiving antiretroviral therapy. Some analytical problems may also jeopardize diagnostic accuracy, including testing outside the diagnostic window, active viral recombination, use of inadequately validated assays, insufficient harmonization, and instrument malfunctioning, along with other specific technical issues. Some practical indications can hence be identified for minimizing the risk of diagnostic errors. These encompass improving diagnostic accuracy by combining clinical evidence with results of chest computed tomography (CT) and RT-PCR; interpreting RT-PCR results according to epidemiologic, clinical, and radiological factors; recollecting and testing upper (or lower) respiratory specimens in patients with negative RT-PCR results and high suspicion or probability of infection; disseminating clear instructions for specimen (especially swab) collection, management, and storage; and refining the molecular target(s), together with thorough compliance with analytical procedures, including quality assurance.
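The advice to retest high-suspicion patients after a negative RT-PCR follows directly from Bayes' rule: with imperfect sensitivity, a negative result still leaves a substantial residual probability of infection when the pretest probability is high. The sensitivity and specificity values below are illustrative, not figures from this paper.

```python
def post_test_prob_given_negative(pretest, sensitivity, specificity):
    """Probability of infection after a negative test (Bayes' rule)."""
    p_neg_if_infected = 1.0 - sensitivity   # false-negative rate
    p_neg_if_healthy = specificity          # true-negative rate
    num = pretest * p_neg_if_infected
    return num / (num + (1.0 - pretest) * p_neg_if_healthy)

# High clinical suspicion (60%) tested with an assumed 70%-sensitive,
# 98%-specific swab assay: a negative result still leaves ~31% probability
# of infection, motivating specimen recollection and retesting.
print(round(post_test_prob_given_negative(0.60, 0.70, 0.98), 3))
```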


Subject(s)
Coronavirus Infections/diagnosis, Coronavirus Infections/economics, Medical Errors/trends, Pandemics/economics, Pneumonia, Viral/diagnosis, Pneumonia, Viral/economics, Scientific Experimental Error/trends, Betacoronavirus/pathogenicity, COVID-19, Clinical Laboratory Techniques/economics, Clinical Laboratory Techniques/standards, Coronavirus/pathogenicity, Disease Outbreaks/economics, Humans, SARS-CoV-2, Specimen Handling/economics, Specimen Handling/methods
20.
J Immunol ; 201(12): 3694-3704, 2018 12 15.
Article in English | MEDLINE | ID: mdl-30397033

ABSTRACT

Next-generation sequencing of the Ig gene repertoire (Ig-seq) produces large volumes of information at the nucleotide sequence level. Such data have improved our understanding of immune systems across numerous species and have already been successfully applied in vaccine development and drug discovery. However, the high-throughput nature of Ig-seq means that it is afflicted by high error rates. This has led to the development of error-correction approaches. Computational error-correction methods use sequence information alone, primarily designating sequences as likely to be correct if they are observed frequently. In this work, we describe an orthogonal method for filtering Ig-seq data, which considers the structural viability of each sequence. A typical natural Ab structure requires the presence of a disulfide bridge within each of its variable chains to maintain the fold. Our Ab Sequence Selector (ABOSS) uses the presence/absence of this bridge as a way of both identifying structurally viable sequences and estimating the sequencing error rate. On simulated Ig-seq datasets, ABOSS is able to identify more than 99% of structurally viable sequences. Applying our method to six independent Ig-seq datasets (one mouse and five human), we show that our error calculations are in line with previous experimental and computational error estimates. We also show how ABOSS is able to identify structurally impossible sequences missed by other error-correction methods.
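The structural-viability filter at the heart of ABOSS can be caricatured as a check for the two conserved cysteines that form the variable domain's intradomain disulfide bridge. The window positions below are illustrative stand-ins for a proper numbering scheme such as IMGT, and the sequences are synthetic.

```python
def has_disulfide_pair(vdomain_seq, first_range=(20, 24), second_range=(88, 98)):
    """True if a cysteine falls in each of the two expected windows of an
    antibody variable-domain sequence (toy windows, not IMGT numbering)."""
    first = any(
        vdomain_seq[i] == "C" for i in range(*first_range) if i < len(vdomain_seq)
    )
    second = any(
        vdomain_seq[i] == "C" for i in range(*second_range) if i < len(vdomain_seq)
    )
    return first and second

# A synthetic domain with cysteines at positions 22 and 93, and a variant
# where a sequencing error has destroyed the first bridge cysteine:
viable = "X" * 22 + "C" + "X" * 70 + "C" + "X" * 10
broken = viable.replace("C", "S", 1)
print(has_disulfide_pair(viable), has_disulfide_pair(broken))
```

Because a natural antibody cannot fold without this bridge, sequences failing the check are structurally impossible regardless of how often they were observed, which is what makes the filter orthogonal to frequency-based error correction.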


Subject(s)
High-Throughput Nucleotide Sequencing/methods, Immunoglobulins/genetics, Software, Vaccines/immunology, Algorithms, Animals, Computational Biology, Databases as Topic, Drug Development, Humans, Mice, Protein Conformation, Quality Control, Scientific Experimental Error, Structure-Activity Relationship