1.
Euro Surveill ; 28(36), 2023 Sep.
Article in English | MEDLINE | ID: mdl-37676147

ABSTRACT

We describe 10 cases of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variant BA.2.86 detected in Denmark, including molecular characteristics and results from wastewater surveillance that indicate that the variant is circulating in the country at a low level. This new variant with many spike gene mutations was classified as a variant under monitoring by the World Health Organization on 17 August 2023. Further global monitoring of COVID-19, BA.2.86 and other SARS-CoV-2 variants is highly warranted.


Subject(s)
COVID-19 , Humans , SARS-CoV-2/genetics , Wastewater , Wastewater-Based Epidemiological Monitoring , Denmark/epidemiology
2.
Breast Cancer Res ; 17: 102, 2015 Aug 05.
Article in English | MEDLINE | ID: mdl-26242876

ABSTRACT

INTRODUCTION: By convention, a contralateral breast cancer (CBC) is treated as a new primary tumor, independent of the first cancer (BC1). Although there have been indications that the second tumor (BC2) sometimes may represent a metastatic spread of BC1, this has never been conclusively shown. We sought to apply next-generation sequencing to determine a "genetic barcode" for each tumor and reveal the clonal relationship of CBCs. METHODS: Ten CBC patients with detailed clinical information and available fresh frozen tumor tissue were studied. Using low-coverage whole genome DNA-sequencing data for each tumor, chromosomal rearrangements were enumerated and copy number profiles were generated. Comparisons between tumors provided an estimate of clonal relatedness for tumor pairs within individual patients. RESULTS: Between 15 and 256 rearrangements were detected in each tumor (median 87). For one patient, 76% (68 out of 90) of the rearrangements were shared between BC1 and BC2, highly consistent with what has been seen for true primary-metastasis pairs (>50%) and thus confirming a common clonal origin of the two tumors. For most of the remaining cases, BC1 and BC2 had similarly low overlap as unmatched randomized pairs of tumors from different individuals, suggesting that the CBC represents a new independent primary tumor. CONCLUSION: Using rearrangement fingerprinting, we show for the first time with certainty that a contralateral BC2 can represent a metastatic spread of BC1. Given the poor prognosis of a generalized disease compared to a new primary tumor, these women need to be identified at diagnosis of CBC for appropriate determination of treatment. Our approach generates a promising new method to assess clonal relationship between tumors. Additional studies are required to confirm the frequency of CBCs representing metastatic events.


Subject(s)
Breast Neoplasms/genetics , Breast Neoplasms/pathology , Neoplasm Metastasis/genetics , Neoplasm Metastasis/pathology , Neoplasms, Second Primary/genetics , Neoplasms, Second Primary/pathology , Adult , Aged, 80 and over , Female , High-Throughput Nucleotide Sequencing/methods , Humans , Middle Aged
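
The shared-rearrangement comparison described in this abstract can be sketched roughly as follows. This is an illustrative sketch, not the authors' pipeline. It assumes that each rearrangement is reduced to a pair of breakpoints, that two calls count as the same event when both breakpoints fall in the same fixed-size bin (the 1,000 bp bin is an assumed value), and that the shared fraction is taken over all distinct events in the pair. The names `bin_calls` and `shared_fraction` are hypothetical.

```python
from typing import Iterable, Set, Tuple

# A rearrangement is modeled as two breakpoints: (chrom_a, pos_a, chrom_b, pos_b).
Rearrangement = Tuple[str, int, str, int]


def bin_calls(calls: Iterable[Rearrangement], bin_size: int = 1000) -> Set[Rearrangement]:
    """Round breakpoints down to bins so that nearby calls compare as equal."""
    return {
        (chrom_a, pos_a // bin_size, chrom_b, pos_b // bin_size)
        for chrom_a, pos_a, chrom_b, pos_b in calls
    }


def shared_fraction(tumor1: Iterable[Rearrangement],
                    tumor2: Iterable[Rearrangement],
                    bin_size: int = 1000) -> float:
    """Fraction of distinct binned rearrangements that occur in both tumors."""
    a = bin_calls(tumor1, bin_size)
    b = bin_calls(tumor2, bin_size)
    union = a | b
    if not union:
        return 0.0  # no calls in either tumor
    return len(a & b) / len(union)
```

Under this definition, a pair with 68 shared events out of 90 distinct events would give 68/90, which rounds to the 76% quoted in the abstract; whether the authors used the same denominator is not stated there.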
3.
Heliyon ; 10(9): e29703, 2024 May 15.
Article in English | MEDLINE | ID: mdl-38694057

ABSTRACT

Wastewater sequencing has become a powerful supplement to clinical testing in monitoring SARS-CoV-2 infections in the post-COVID-19 pandemic era. While its applications in measuring the viral burden and main circulating lineages in the community have proved their efficacy, the variations in sequencing quality and coverage across the different regions of the SARS-CoV-2 genome are not well understood. Furthermore, it is unclear how different sample origins, viral extraction and concentration methods and environmental factors impact the reads sequenced from wastewater. Using high-coverage, amplicon-based, paired-end read sequencing of viral RNA extracted from wastewater collected directly from aircraft, pooled from different aircraft and airport buildings or from regular wastewater plants, we assessed the genome coverage across the sample groups with a focus on the 5'-end region covering the leader sequence and investigated whether it was possible to detect subgenomic RNA from viral material recovered from wastewater. We identified distinct patterns in the persistence of the different genomic regions across the different types of wastewaters and the existence of chimeric reads mapping to non-amplified regions. Our findings suggest that preservation of the 5'-end of the genome and the ability to detect subgenomic RNA reads, though highly susceptible to environment and sample processing conditions, may be indicative of the quality and amount of the viral RNA present in wastewater.

4.
EBioMedicine ; 93: 104669, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37348163

ABSTRACT

BACKGROUND: Severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) has rapidly spread worldwide in the population since it was first detected in late 2019. The transcription and replication of coronaviruses, although not fully understood, is characterised by the production of genomic length RNA and shorter subgenomic RNAs to make viral proteins and ultimately progeny virions. Observed levels of subgenomic RNAs differ between sub-lineages and open reading frames but their biological significance is presently unclear. METHODS: Using a large and diverse panel of virus sequencing data produced as part of the Danish COVID-19 routine surveillance together with information in electronic health registries, we assessed the association of subgenomic RNA levels with demographic and clinical variables of the infected individuals. FINDINGS: Our findings suggest no significant statistical relationship between levels of subgenomic RNAs and host-related factors. INTERPRETATION: Differences between lineages and subgenomic ORFs may be related to differences in target cell tropism, early virus replication/transcription kinetics or sequence features. FUNDING: The author(s) received no specific funding for this work.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , Subgenomic RNA , Genomics , Denmark/epidemiology
5.
Microorganisms ; 11(10), 2023 Oct 04.
Article in English | MEDLINE | ID: mdl-37894148

ABSTRACT

The emergence of antibiotic resistance is a global health concern. Therefore, understanding the mechanisms of its spread is crucial for implementing evidence-based strategies to tackle resistance in the context of the One Health approach. In developing countries where sanitation systems and access to clean and safe water are still major challenges, contamination may introduce bacteria and bacteriophages harboring antibiotic resistance genes (ARGs) into the environment. This contamination can increase the risk of exposure and community transmission of ARGs and infectious pathogens. However, there is a paucity of information on the mechanisms of bacteriophage-mediated spread of ARGs and patterns through the environment. Here, we deploy Droplet Digital PCR (ddPCR) and metagenomics approaches to analyze the abundance of ARGs and bacterial pathogens disseminated through clean and wastewater systems. We detected a relatively less-studied and rare human zoonotic pathogen, Vibrio metschnikovii, known to spread through fecal-oral contamination, similarly to V. cholerae. Several antibiotic resistance genes were identified in both bacterial and bacteriophage fractions from water sources. Using metagenomics, we detected several resistance genes related to tetracyclines and beta-lactams in all the samples. Environmental samples from outlet wastewater had a high diversity of ARGs and contained high levels of blaOXA-48. Other identified resistance profiles included tetA, tetM, and blaCTX-M9. Specifically, we demonstrated that blaCTX-M1 is enriched in the bacteriophage fraction from wastewater. In general, however, the bacterial community has a significantly higher abundance of resistance genes compared to the bacteriophage population. In conclusion, the study highlights the need to implement environmental monitoring of clean and wastewater to inform the risk of infectious disease outbreaks and the spread of antibiotic resistance in the context of One Health.

6.
Sci Rep ; 13(1): 23039, 2023 Dec 27.
Article in English | MEDLINE | ID: mdl-38155185

ABSTRACT

Citrullinated vimentin has been linked to several chronic and autoimmune diseases, but how citrullinated vimentin is associated with disease prevalence and genetic variants in a clinical setting remains unknown. The aim of this study was to obtain a better understanding of the genetic variants and pathologies associated with citrullinated and MMP-degraded vimentin. Patient registry data, serum samples and genotypes were collected for a total of 4369 Danish post-menopausal women enrolled in the Prospective Epidemiologic and Risk Factor study (PERF). Circulating citrullinated and MMP-degraded vimentin (VICM) was measured. Genome-wide association studies (GWAS) and phenome-wide association studies (PheWAS) with levels of VICM were performed. High levels of VICM were significantly associated with the prevalence of chronic pulmonary diseases and death from respiratory and cardiovascular diseases (CVD). GWAS identified 33 single nucleotide polymorphisms (SNPs) with a significant association with VICM. These variants were in the peptidylarginine deiminase 3/4 (PADI3/PADI4) and Complement Factor H (CFH)/KCNT2 gene loci on chromosome 1. Serum levels of VICM, a marker of citrullinated and MMP-degraded vimentin, were associated with chronic pulmonary diseases and genetic variance in PADI3/PADI4 and CFH/KCNT2. This points to the potential for VICM to be used as an activity marker of both citrullination and inflammation, identifying responders to targeted treatment and patients likely to experience disease progression.


Subject(s)
Genome-Wide Association Study , Lung Diseases , Humans , Female , Protein-Arginine Deiminases/genetics , Vimentin/genetics , Prospective Studies , Postmenopause/genetics , Lung Diseases/genetics , Hydrolases/genetics , Potassium Channels, Sodium-Activated/genetics , Protein-Arginine Deiminase Type 3
7.
Front Microbiol ; 13: 1049110, 2022.
Article in English | MEDLINE | ID: mdl-36425042

ABSTRACT

Spread of antibiotic resistance is a significant challenge for our modern health care system, and even more so in developing countries with higher prevalence of both infections and resistant bacteria. Faulty usage of antibiotics has been pinpointed as a driving factor in the spread of resistant bacteria through selective pressure. However, horizontal gene transfer mediated through bacteriophages may also play an important role in this spread. In a cohort of Tanzanian patients suffering from bacterial infections, we demonstrate significant differences in the oral microbial diversity between infected and non-infected individuals, as well as before and after oral antibiotics treatment. Further, the resistomes carried by bacteria and by bacteriophages vary significantly, with blaCTX-M1 resistance genes being mobilized and enriched within phage populations. This may impact how we consider spread of resistance in a biological context, as well as in terms of treatment regimes.

8.
Virus Evol ; 7(2): veab055, 2021.
Article in English | MEDLINE | ID: mdl-34532059

ABSTRACT

Understanding of pandemics depends on the characterization of pathogen collections from well-defined and demographically diverse cohorts. Since its emergence in Congo almost a century ago, Human Immunodeficiency Virus Type 1 (HIV-1) has geographically spread and genetically diversified into distinct viral subtypes. Phylogenetic analysis can be used to reconstruct the ancestry of the virus to better understand the origin and distribution of subtypes. We sequenced two 3.6-kb amplicons of HIV-1 genomes from 3,197 participants in a clinical trial with consistent and uniform sampling at sites across 35 countries and analyzed our data with another 2,632 genomes that comprehensively reflect the HIV-1 genetic diversity. We used maximum likelihood phylogenetic analysis coupled with geographical information to infer the state of ancestors. The majority of our sequenced genomes (n = 2,501) were either pure subtypes (A-D, F, and G) or CRF01_AE. The diversity and distribution of subtypes across geographical regions differed; USA showed the most homogenous subtype population, whereas African samples were most diverse. We delineated transmission of the four most prevalent subtypes in our dataset (A, B, C, and CRF01_AE), and our results suggest both continuous and frequent transmission of HIV-1 over country borders, as well as single transmission events being the seed of endemic population expansions. Overall, we show that coupling of genetic and geographical information of HIV-1 can be used to understand the origin and spread of pandemic pathogens.

9.
Nucleic Acids Res ; 36(Database issue): D102-6, 2008 Jan.
Article in English | MEDLINE | ID: mdl-18006571

ABSTRACT

JASPAR is a popular open-access database for matrix models describing DNA-binding preferences for transcription factors and other DNA patterns. With its third major release, JASPAR has been expanded and equipped with additional functions aimed at both casual and power users. The heart of the JASPAR database, the JASPAR CORE sub-database, has increased by 12% in size, and three new specialized sub-databases have been added. New functions include clustering of matrix models by similarity, generation of random matrices by sampling from selected sets of existing models and a language-independent Web Service applications programming interface for matrix retrieval. JASPAR is available at http://jaspar.genereg.net.


Subject(s)
Databases, Nucleic Acid , Regulatory Elements, Transcriptional , Transcription Factors/metabolism , Access to Information , Animals , Binding Sites , Computational Biology , Data Interpretation, Statistical , Humans , Internet , Models, Genetic , Promoter Regions, Genetic , RNA Splice Sites , Software , User-Computer Interface
10.
Neurol Genet ; 6(5): e508, 2020 Oct.
Article in English | MEDLINE | ID: mdl-33134509

ABSTRACT

OBJECTIVE: Dysregulation of type I collagen metabolism has a great impact on human health. We have previously seen that matrix metalloproteinase-degraded type I collagen (C1M) is associated with early death and age-related pathologies. To dissect the biological impact of type I collagen dysregulation, we have performed a genome-wide screening of the genetic factors related to type I collagen turnover. METHODS: Patient registry data and genotypes have been collected for a total of 4,981 Danish postmenopausal women. Genome-wide association with serum levels of C1M was assessed and phenotype-genotype association analysis performed. RESULTS: Twenty-two genome-wide significant variants associated with C1M were identified in the APOE-C1/TOMM40 gene cluster. The APOE-C1/TOMM40 gene cluster is associated with hyperlipidemia and cognitive disorders, and we further found that C1M levels correlated with tau degradation markers and were decreased in women with preclinical cognitive impairment. CONCLUSIONS: Our study provides elements for better understanding the role of the collagen metabolism in the onset of cognitive impairment.

11.
Nat Commun ; 11(1): 363, 2020 Jan 17.
Article in English | MEDLINE | ID: mdl-31953409

ABSTRACT

Infections have become the major cause of morbidity and mortality among patients with chronic lymphocytic leukemia (CLL) due to immune dysfunction and cytotoxic CLL treatment. Yet, predictive models for infection are missing. In this work, we develop the CLL Treatment-Infection Model (CLL-TIM) that identifies patients at risk of infection or CLL treatment within 2 years of diagnosis as validated on both internal and external cohorts. CLL-TIM is an ensemble algorithm composed of 28 machine learning algorithms based on data from 4,149 patients with CLL. The model is capable of dealing with heterogeneous data, including the high rates of missing data to be expected in the real-world setting, with a precision of 72% and a recall of 75%. To address concerns regarding the use of complex machine learning algorithms in the clinic, for each patient with CLL, CLL-TIM provides explainable predictions through uncertainty estimates and personalized risk factors.


Subject(s)
Infections/diagnosis , Leukemia, Lymphocytic, Chronic, B-Cell/complications , Machine Learning , Risk Factors , Aged , Algorithms , Antineoplastic Agents/therapeutic use , Benchmarking , Cohort Studies , Databases, Factual , Female , Humans , Infections/etiology , Kaplan-Meier Estimate , Leukemia, Lymphocytic, Chronic, B-Cell/drug therapy , Male , Middle Aged
12.
BMC Bioinformatics ; 10: 388, 2009 Nov 26.
Article in English | MEDLINE | ID: mdl-19941641

ABSTRACT

BACKGROUND: The accurate determination of transcription factor binding affinities is an important problem in biology and key to understanding the gene regulation process. Position weight matrices are commonly used to represent the binding properties of transcription factor binding sites but suffer from low information content and a large number of false matches in the genome. We describe a novel algorithm for the refinement of position weight matrices representing transcription factor binding sites based on experimental data, including ChIP-chip analyses. We present an iterative weight matrix optimization method that is more accurate in distinguishing true transcription factor binding sites from a negative control set. The initial position weight matrix comes from JASPAR, TRANSFAC or other sources. The main new features are the discriminative nature of the method and matrix width and length optimization. RESULTS: The algorithm was applied to the increasing collection of known transcription factor binding sites obtained from ChIP-chip experiments. The results show that our algorithm significantly improves the sensitivity and specificity of matrix models for identifying transcription factor binding sites. CONCLUSION: When the transcription factor is known, it is more appropriate to use a discriminative approach such as the one presented here to derive its transcription factor-DNA binding properties starting with a matrix, as opposed to performing de novo motif discovery. Generating more accurate position weight matrices will ultimately contribute to a better understanding of eukaryotic transcriptional regulation, and could potentially offer a better alternative to ab initio motif discovery.


Subject(s)
Computational Biology/methods , Software , Binding Sites , DNA/chemistry , DNA/metabolism , Oligonucleotide Array Sequence Analysis , Pattern Recognition, Automated/methods , Position-Specific Scoring Matrices , Transcription Factors/chemistry , Transcription Factors/metabolism
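
For readers unfamiliar with position weight matrices, the basic scoring step that such models rely on can be sketched as below. This shows generic log-odds scoring only, not the iterative optimization described in the abstract. The pseudocount of 1, the uniform 0.25 background, and the function names are assumptions made for illustration; input is expected to contain only A, C, G and T.

```python
import math
from typing import Dict, List, Tuple

BASES = "ACGT"


def counts_to_pwm(counts: List[Dict[str, int]], pseudocount: float = 1.0,
                  background: float = 0.25) -> List[Dict[str, float]]:
    """Convert per-position base counts into log2-odds weights."""
    pwm = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def score_site(pwm: List[Dict[str, float]], site: str) -> float:
    """Sum the weights of the observed base at each matrix position."""
    if len(site) != len(pwm):
        raise ValueError("site length must equal matrix width")
    return sum(column[base] for column, base in zip(pwm, site.upper()))


def best_hit(pwm: List[Dict[str, float]], sequence: str) -> Tuple[int, float]:
    """Return (offset, score) of the best forward-strand window.

    Raises ValueError if the sequence is shorter than the matrix.
    """
    width = len(pwm)
    windows = ((i, score_site(pwm, sequence[i:i + width]))
               for i in range(len(sequence) - width + 1))
    return max(windows, key=lambda hit: hit[1])
```

A discriminative refinement of the kind the abstract describes would adjust the matrix so that scores like these separate bound sites from a negative control set; that step is not shown here.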
13.
Nucleic Acids Res ; 35(Database issue): D732-6, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17090589

ABSTRACT

Estrogen has a profound impact on human physiology affecting transcription of numerous genes. To decipher functional characteristics of estrogen responsive genes, we developed KnowledgeBase for Estrogen Responsive Genes (KBERG). Genes in KBERG were derived from Estrogen Responsive Gene Database (ERGDB) and were analyzed from multiple aspects. We explored the possible transcription regulation mechanism by capturing highly conserved promoter motifs across orthologous genes, using promoter regions that cover the range of [-1200, +500] relative to the transcription start sites. The motif detection is based on ab initio discovery of common cis-elements from the orthologous gene cluster from human, mouse and rat, thus reflecting a degree of promoter sequence preservation during evolution. The identified motifs are linked to transcription factor binding sites based on the TRANSFAC database. In addition, KBERG uses two established ontology systems, GO and eVOC, to associate genes with their function. Users may assess gene functionality through the description terms in GO. Alternatively, they can gain gene co-expression information through evidence from human EST libraries via eVOC. KBERG is a user-friendly system that provides links to other relevant resources such as ERGDB, UniGene, Entrez Gene, HomoloGene, GO, eVOC and GenBank, and thus offers a platform for functional exploration and potential annotation of genes responsive to estrogen. KBERG database can be accessed at http://research.i2r.a-star.edu.sg/kberg.


Subject(s)
Databases, Nucleic Acid , Estrogens/physiology , Promoter Regions, Genetic , Transcription Factors/metabolism , Animals , Base Sequence , Binding Sites , Conserved Sequence , Humans , Internet , Mice , Rats , Sequence Analysis, DNA , Transcription, Genetic , User-Computer Interface
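
The promoter range [-1200, +500] relative to the transcription start site mentioned in this abstract translates into genomic coordinates differently on each strand. The sketch below illustrates that conversion under simplifying assumptions: the TSS is given as a single genomic position, offsets are added directly (no special handling of a missing position 0), and coordinates are not clamped to chromosome bounds. The function name is hypothetical and is not part of KBERG.

```python
from typing import Tuple


def promoter_window(tss: int, strand: str, upstream: int = 1200,
                    downstream: int = 500) -> Tuple[int, int]:
    """Return (start, end) genomic coordinates, start <= end, for the promoter window."""
    if strand == "+":
        return tss - upstream, tss + downstream
    if strand == "-":
        # On the minus strand, upstream lies at higher genomic coordinates.
        return tss - downstream, tss + upstream
    raise ValueError("strand must be '+' or '-'")
```

For a TSS at position 10,000, this gives 8,800 to 10,500 on the plus strand and 9,500 to 11,200 on the minus strand.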
14.
Nucleic Acids Res ; 32(21): 6212-7, 2004.
Article in English | MEDLINE | ID: mdl-15576347

ABSTRACT

Estrogen has a profound impact on human physiology and affects numerous genes. The classical estrogen reaction is mediated by its receptors (ERs), which bind to the estrogen response elements (EREs) in the promoter regions of target genes. Because the experiments are tedious and expensive, only a limited number of human genes are functionally well characterized. It is still unclear how many and which human genes respond to estrogen treatment. We propose a simple, economic, yet effective computational method to predict a subclass of estrogen responsive genes. Our method relies on the similarity of ERE frames across different promoters in the human genome. Matching ERE frames of a test set of 60 known estrogen responsive genes to the collection of over 18,000 human promoters, we obtained 604 candidate genes. Evaluating our result by comparison with the published microarray data and literature, we found that more than half (53.6%, 324/604) of predicted candidate genes are responsive to estrogen. We believe this method can significantly reduce the number of potential estrogen target genes that need to be tested and provide functional clues for annotating some of the genes that lack functional information.


Subject(s)
Computational Biology/methods , Estrogens/pharmacology , Gene Expression Regulation , Genomics/methods , Gene Expression Profiling , Genome, Human , Humans , Promoter Regions, Genetic , Response Elements
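
The abstract does not spell out how "ERE frames" are matched, so the sketch below substitutes a much simpler technique: a plain scan for the palindromic consensus ERE, GGTCAnnnTGACC, allowing a small number of mismatches in the two half-sites. It is meant only to illustrate the idea of screening promoter sequences for ERE-like sites. The mismatch threshold and the function name are assumptions.

```python
from typing import List, Tuple

ERE_LEFT, ERE_RIGHT, SPACER = "GGTCA", "TGACC", 3


def find_ere_like_sites(seq: str, max_mismatches: int = 1) -> List[Tuple[int, str, int]]:
    """Return (offset, site, mismatches) for windows resembling GGTCAnnnTGACC.

    The consensus is its own reverse complement, so scanning the forward
    strand also covers sites on the reverse strand.
    """
    seq = seq.upper()
    width = len(ERE_LEFT) + SPACER + len(ERE_RIGHT)
    hits = []
    for i in range(len(seq) - width + 1):
        site = seq[i:i + width]
        left = site[:len(ERE_LEFT)]
        right = site[len(ERE_LEFT) + SPACER:]
        # Count mismatches in the half-sites only; the spacer is unconstrained.
        mismatches = sum(a != b for a, b in zip(left + right, ERE_LEFT + ERE_RIGHT))
        if mismatches <= max_mismatches:
            hits.append((i, site, mismatches))
    return hits
```

Candidate promoters found this way would still need the kind of validation the abstract reports, where 324 of 604 predictions (53.6%) were supported by microarray data or literature.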
15.
Oncotarget ; 6(35): 37169-84, 2015 Nov 10.
Article in English | MEDLINE | ID: mdl-26439695

ABSTRACT

To better understand and characterize chromosomal structural variation during breast cancer progression, we enumerated chromosomal rearrangements for 11 patients by performing low-coverage whole-genome sequencing of 11 primary breast tumors and their 13 matched distant metastases. The tumor genomes harbored a median of 85 (range 18-404) rearrangements per tumor, with a median of 82 (26-310) in primaries compared to 87 (18-404) in distant metastases. Concordance between paired tumors from the same patient was high with a median of 89% of rearrangements shared (range 61-100%), whereas little overlap was found when comparing all possible pairings of tumors from different patients (median 3%). The tumors exhibited diverse genomic patterns of rearrangements: some carried events distributed throughout the genome while others had events mostly within densely clustered chromothripsis-like foci at a few chromosomal locations. Irrespectively, the patterns were highly conserved between the primary tumor and metastases from the same patient. Rearrangements occurred more frequently in genic areas than expected by chance and among the genes affected there was significant enrichment for cancer-associated genes including disruption of TP53, RB1, PTEN, and ESR1, likely contributing to tumor development. Our findings are most consistent with chromosomal rearrangements being early events in breast cancer progression that remain stable during the development from primary tumor to distant metastasis.


Subject(s)
Biomarkers, Tumor/genetics , Breast Neoplasms/genetics , Chromosome Aberrations , Chromosomes, Human/genetics , Gene Rearrangement , Genome, Human , High-Throughput Nucleotide Sequencing/methods , Adult , Aged , Breast Neoplasms/pathology , Female , Humans , Middle Aged , Neoplasm Metastasis , RNA, Messenger/genetics , Real-Time Polymerase Chain Reaction , Reverse Transcriptase Polymerase Chain Reaction
16.
EMBO Mol Med ; 7(8): 1034-47, 2015 Aug.
Article in English | MEDLINE | ID: mdl-25987569

ABSTRACT

Metastatic breast cancer is usually diagnosed after becoming symptomatic, at which point it is rarely curable. Cell-free circulating tumor DNA (ctDNA) contains tumor-specific chromosomal rearrangements that may be interrogated in blood plasma. We evaluated serial monitoring of ctDNA for earlier detection of metastasis in a retrospective study of 20 patients diagnosed with primary breast cancer and long follow-up. Using an approach combining low-coverage whole-genome sequencing of primary tumors and quantification of tumor-specific rearrangements in plasma by droplet digital PCR, we identify for the first time that ctDNA monitoring is highly accurate for postsurgical discrimination between patients with (93%) and without (100%) eventual clinically detected recurrence. ctDNA-based detection preceded clinical detection of metastasis in 86% of patients with an average lead time of 11 months (range 0-37 months), whereas patients with long-term disease-free survival had undetectable ctDNA postoperatively. ctDNA quantity was predictive of poor survival. These findings establish the rationale for larger validation studies in early breast cancer to evaluate ctDNA as a monitoring tool for early metastasis detection, therapy modification, and to aid in avoidance of overtreatment.


Subject(s)
Biomarkers, Tumor/blood , Breast Neoplasms/complications , Breast Neoplasms/pathology , DNA/blood , Neoplasm Metastasis/diagnosis , Female , Humans , Longitudinal Studies , Prognosis , Retrospective Studies
17.
Genome Med ; 7(1): 20, 2015.
Article in English | MEDLINE | ID: mdl-25722745

ABSTRACT

BACKGROUND: Breast cancer exhibits significant molecular, pathological, and clinical heterogeneity. Current clinicopathological evaluation is imperfect for predicting outcome, which results in overtreatment for many patients, and for others, leads to death from recurrent disease. Therefore, additional criteria are needed to better personalize care and maximize treatment effectiveness and survival. METHODS: To address these challenges, the Sweden Cancerome Analysis Network - Breast (SCAN-B) consortium was initiated in 2010 as a multicenter prospective study with longsighted aims to analyze breast cancers with next-generation genomic technologies for translational research in a population-based manner and integrated with healthcare; decipher fundamental tumor biology from these analyses; utilize genomic data to develop and validate new clinically-actionable biomarker assays; and establish real-time clinical implementation of molecular diagnostic, prognostic, and predictive tests. In the first phase, we focus on molecular profiling by next-generation RNA-sequencing on the Illumina platform. RESULTS: In the first 3 years from 30 August 2010 through 31 August 2013, we have consented and enrolled 3,979 patients with primary breast cancer at the seven hospital sites in South Sweden, representing approximately 85% of eligible patients in the catchment area. Preoperative blood samples have been collected for 3,942 (99%) patients and primary tumor specimens collected for 2,929 (74%) patients. Herein we describe the study infrastructure and protocols and present initial proof of concept results from prospective RNA sequencing including tumor molecular subtyping and detection of driver gene mutations. Prospective patient enrollment is ongoing. CONCLUSIONS: We demonstrate that large-scale population-based collection and RNA-sequencing analysis of breast cancer is feasible. The SCAN-B Initiative should significantly reduce the time to discovery, validation, and clinical implementation of novel molecular diagnostic and predictive tests. We welcome the participation of additional comprehensive cancer treatment centers. TRIAL REGISTRATION: ClinicalTrials.gov identifier NCT02306096.

18.
Genome Res ; 19(2): 255-65, 2009 Feb.
Article in English | MEDLINE | ID: mdl-19074369

ABSTRACT

Finding and characterizing mRNAs, their transcription start sites (TSS), and their associated promoters is a major focus in post-genome biology. Mammalian cells have at least 5-10 magnitudes more TSS than previously believed, and deeper sequencing is necessary to detect all active promoters in a given tissue. Here, we present a new method for high-throughput sequencing of 5' cDNA tags, DeepCAGE, which merges the Cap Analysis of Gene Expression method with ultra-high-throughput sequence technology. We apply DeepCAGE to characterize 1.4 million sequenced TSS from mouse hippocampus and reveal a wealth of novel core promoters that are preferentially used in hippocampus. This is the most comprehensive promoter data set for any tissue to date. Using these data, we present evidence indicating a key role for the Arnt2 transcription factor in hippocampus gene regulation. DeepCAGE can also detect promoters used only in a small subset of cells within the complex tissue.


Subject(s)
Hippocampus/metabolism , Promoter Regions, Genetic/genetics , Sequence Analysis, DNA/methods , Animals , Binding Sites , Chromosome Mapping/methods , Gene Expression , Mice , Mice, Inbred C57BL , Models, Biological , Organ Specificity/genetics , Protein Binding , Transcription Factors/metabolism
19.
J Comput Biol ; 15(10): 1347-63, 2008 Dec.
Article in English | MEDLINE | ID: mdl-19040368

ABSTRACT

We present BayesMD, a Bayesian Motif Discovery model with several new features. Three different types of biological a priori knowledge are built into the framework in a modular fashion. A mixture of Dirichlets is used as prior over nucleotide probabilities in binding sites. It is trained on transcription factor (TF) databases in order to extract the typical properties of TF binding sites. In a similar fashion we train organism-specific priors for the background sequences. Lastly, we use a prior over the position of binding sites. This prior represents information complementary to the motif and background priors coming from conservation, local sequence complexity, nucleosome occupancy, etc. and assumptions about the number of occurrences. The Bayesian inference is carried out using a combination of exact marginalization (multinomial parameters) and sampling (over the position of sites). Robust sampling results are achieved using the advanced sampling method parallel tempering. In a post-analysis step candidate motifs with high marginal probability are found by searching among those motifs that contain sites that occur frequently. Thereby, maximum a posteriori inference for the motifs is avoided and the marginal probabilities can be used directly to assess the significance of the findings. The framework is benchmarked against other methods on a number of real and artificial data sets. The accompanying prediction server, documentation, software, models and data are available from http://bayesmd.binf.ku.dk/.


Subject(s)
Algorithms , Amino Acid Motifs/genetics , Bayes Theorem , Models, Genetic , Sequence Analysis, DNA/methods , Base Sequence , Molecular Sequence Data , ROC Curve , Sequence Alignment/methods , Software
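
The "exact marginalization (multinomial parameters)" mentioned in this abstract refers to a standard result: with a Dirichlet prior, the nucleotide probabilities of a motif column can be integrated out in closed form, and a mixture of Dirichlets gives a weighted sum of such terms. The sketch below shows that textbook computation for one column of base counts. It is not taken from the BayesMD software, and the function names and example priors are assumptions.

```python
import math
from typing import Sequence


def log_marginal(counts: Sequence[float], alpha: Sequence[float]) -> float:
    """log P(observed bases | Dirichlet(alpha)), multinomial parameters integrated out.

    This is the probability of one specific ordered set of observations,
    so no multinomial coefficient is included.
    """
    alpha_sum, count_sum = sum(alpha), sum(counts)
    result = math.lgamma(alpha_sum) - math.lgamma(alpha_sum + count_sum)
    for n, a in zip(counts, alpha):
        result += math.lgamma(a + n) - math.lgamma(a)
    return result


def log_marginal_mixture(counts: Sequence[float],
                         weights: Sequence[float],
                         alphas: Sequence[Sequence[float]]) -> float:
    """Same quantity under a mixture of Dirichlet components (log-sum-exp for stability)."""
    terms = [math.log(w) + log_marginal(counts, a) for w, a in zip(weights, alphas)]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))
```

As a sanity check, a single observed base under a uniform Dirichlet(1, 1, 1, 1) prior has marginal probability 1/4, which is what `log_marginal([1, 0, 0, 0], [1, 1, 1, 1])` evaluates to after exponentiation.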