Results 1 - 20 of 74
1.
Genome Res ; 32(2): 403-408, 2022 02.
Article in English | MEDLINE | ID: mdl-34965940

ABSTRACT

Genotyping from sequencing is the basis of emerging strategies in the molecular breeding of polyploid plants. However, compared with the situation for diploids, in which genotyping accuracies are confidently determined with comprehensive benchmarks, polyploids have been neglected; there are no benchmarks measuring genotyping error rates for small variants using real sequencing reads. We previously introduced a variant calling method, Octopus, that accurately calls germline variants in diploids and somatic mutations in tumors. Here, we evaluate Octopus and other popular tools on whole-genome tetraploid and hexaploid data sets created using in silico mixtures of diploid Genome in a Bottle (GIAB) samples. We find that genotyping errors are abundant for typical sequencing depths but that Octopus makes 25% fewer errors than other methods on average. We supplement our benchmarks with concordance analysis in real autotriploid banana data sets.


Subject(s)
Benchmarking; Polyploidy; Genotype; High-Throughput Nucleotide Sequencing; Humans
2.
Nat Methods ; 17(11): 1118-1124, 2020 11.
Article in English | MEDLINE | ID: mdl-33046896

ABSTRACT

Predicting the impact of noncoding genetic variation requires interpreting it in the context of three-dimensional genome architecture. We have developed deepC, a transfer-learning-based deep neural network that accurately predicts genome folding from megabase-scale DNA sequence. DeepC predicts domain boundaries at high resolution, learns the sequence determinants of genome folding and predicts the impact of both large-scale structural and single base-pair variations.


Subject(s)
Genome, Human/genetics; Genomics/methods; Models, Genetic; Neural Networks, Computer; Base Sequence; CCCTC-Binding Factor/genetics; Chromatin/genetics; Computer Simulation; Genomic Structural Variation; Humans
3.
Stat Med ; 42(30): 5541-5554, 2023 12 30.
Article in English | MEDLINE | ID: mdl-37850249

ABSTRACT

We review popular unsupervised learning methods for the analysis of high-dimensional data encountered in, for example, genomics, medical imaging, cohort studies, and biobanks. We show that four commonly used methods, principal component analysis, K-means clustering, nonnegative matrix factorization, and latent Dirichlet allocation, can be written as probabilistic models underpinned by a low-rank matrix factorization. In addition to highlighting their similarities, this formulation clarifies the various assumptions and restrictions of each approach, which eases identifying the appropriate method for specific applications for applied medical researchers. We also touch upon the most important aspects of inference and model selection for the application of these methods to health data.


Subject(s)
Algorithms; Unsupervised Machine Learning; Humans; Models, Statistical; Genomics; Cluster Analysis
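
The low-rank factorization view mentioned in this abstract can be illustrated for one of the four methods. The following is a minimal sketch, assuming the deterministic (non-probabilistic) form of K-means, in which the data matrix X is approximated by Z M, with Z a one-hot assignment matrix and M the matrix of centroids. The function names `kmeans` and `reconstruct` and the toy data are hypothetical and are not taken from the paper.

```python
# Hypothetical illustration: K-means written as a factorization X ~ Z M,
# where Z is a one-hot assignment matrix and M holds the cluster centroids.

def kmeans(points, centroids, n_iter=10):
    """Lloyd's algorithm on a list of equal-length tuples."""
    for _ in range(n_iter):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((x - c) ** 2 for x, c in zip(p, centroids[k])))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster in place
        centroids = new_centroids
    return labels, centroids


def reconstruct(labels, centroids):
    """Rows of Z M: every point is replaced by the centroid of its cluster."""
    return [centroids[lab] for lab in labels]


points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
labels, centroids = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
approx = reconstruct(labels, centroids)
```

Here the rank of the approximation equals the number of clusters; the probabilistic formulations reviewed in the paper are not reproduced by this sketch.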
5.
Clin Infect Dis ; 74(12): 2252-2260, 2022 07 06.
Article in English | MEDLINE | ID: mdl-35022697

ABSTRACT

BACKGROUND: Respiratory syncytial virus (RSV), parainfluenza virus (PIV), and human metapneumovirus (hMPV) are increasingly associated with chronic lung allograft dysfunction (CLAD) in lung transplant recipients (LTR). This systematic review primarily aimed to assess outcomes of RSV/PIV/hMPV infections in LTR and secondarily to assess evidence regarding the efficacy of ribavirin. METHODS: Relevant databases were queried and study outcomes extracted using a standardized method and summarized. RESULTS: Nineteen retrospective and 12 prospective studies were included (total 1060 cases). Pooled 30-day mortality was low (0-3%), but CLAD progression 180-360 days postinfection was substantial (pooled incidences 19-24%) and probably associated with severe infection. Ribavirin trended toward effectiveness for CLAD prevention in exploratory meta-analysis (odds ratio [OR] 0.61, [0.27-1.18]), although results were highly variable between studies. CONCLUSIONS: RSV/PIV/hMPV infection was followed by a high CLAD incidence. Treatment options, including ribavirin, are limited. There is an urgent need for high-quality studies to provide better treatment options for these infections.


Subject(s)
Metapneumovirus; Paramyxoviridae Infections; Respiratory Syncytial Virus Infections; Respiratory Syncytial Virus, Human; Respiratory Tract Infections; Humans; Lung; Parainfluenza Virus 1, Human; Parainfluenza Virus 2, Human; Paramyxoviridae Infections/drug therapy; Paramyxoviridae Infections/epidemiology; Prospective Studies; Respiratory Syncytial Virus Infections/drug therapy; Respiratory Syncytial Virus Infections/epidemiology; Respiratory Tract Infections/drug therapy; Respiratory Tract Infections/epidemiology; Retrospective Studies; Ribavirin/therapeutic use; Transplant Recipients
6.
Eur Respir J ; 60(5)2022 11.
Article in English | MEDLINE | ID: mdl-35896214

ABSTRACT

BACKGROUND: Coronavirus disease 2019 (COVID-19) social distancing measures led to a dramatic decline in non-COVID-19 respiratory virus infections, providing a unique opportunity to study their impact on annual forced expiratory volume in 1 s (FEV1) decline, episodes of temporary drop in lung function (TDLF) suggestive of infection and chronic lung allograft dysfunction (CLAD) in lung transplant recipients (LTRs). METHODS: All FEV1 values of LTRs transplanted between 2009 and April 2020 at the University Medical Center Groningen (Groningen, The Netherlands) were included. Annual FEV1 change was estimated with separate estimates for pre-social distancing (2009-2020) and the year with social distancing measures (2020-2021). Patients were grouped by individual TDLF frequency (frequent/infrequent). Respiratory virus circulation was derived from weekly hospital-wide respiratory virus infection rates. Effect modification by TDLF frequency and respiratory virus circulation was assessed. CLAD and TDLF rates were analysed over time. RESULTS: 479 LTRs (12 775 FEV1 values) were included. Pre-social distancing annual change in FEV1 was -114 (95% CI -133- -94) mL, while during social distancing FEV1 did not decline: 5 (95% CI -38-48) mL (difference pre-social distancing versus during social distancing: p<0.001). The frequent TDLF subgroup showed faster annual FEV1 decline compared with the infrequent TDLF subgroup (-150 (95% CI -181- -120) versus -90 (95% CI -115- -65) mL; p=0.003). During social distancing, we found significantly lower odds for any TDLF (OR 0.53, 95% CI 0.33-0.85; p=0.008) and severe TDLF (OR 0.34, 0.16-0.71; p=0.005) as well as lower CLAD incidence (OR 0.53, 95% CI 0.27-1.02; p=0.060). Effect modification by respiratory virus circulation indicated a significant association between TDLF/CLAD and respiratory viruses. 
CONCLUSIONS: During COVID-19 social distancing the strong reduction in respiratory virus circulation coincided with markedly less FEV1 decline, fewer episodes of TDLF and possibly less CLAD. Effect modification by respiratory virus circulation suggests an important role for respiratory viruses in lung function decline in LTRs.


Subject(s)
COVID-19; Lung Transplantation; Viruses; Humans; Transplant Recipients; Physical Distancing; Follow-Up Studies; Lung
7.
Gynecol Oncol ; 164(2): 265-270, 2022 02.
Article in English | MEDLINE | ID: mdl-34955237

ABSTRACT

BACKGROUND: Laparoscopic hysterectomy is accepted worldwide as the standard treatment option for early-stage endometrial cancer. However, there are limited data on long-term survival, particularly when no lymphadenectomy is performed. We compared the survival outcomes of total laparoscopic hysterectomy (TLH) and total abdominal hysterectomy (TAH), both without lymphadenectomy, for early-stage endometrial cancer up to 5 years postoperatively. METHODS: Follow-up of a multi-centre, randomised controlled trial comparing TLH and TAH, without routine lymphadenectomy, for women with stage I endometrial cancer. Enrolment was between 2007 and 2009 by 2:1 randomisation to TLH or TAH. Outcomes were disease-free survival (DFS), overall survival (OS), disease-specific survival (DSS), and primary site of recurrence. Multivariable Cox regression analyses were adjusted for age, stage, grade, and radiotherapy with adjusted hazard ratios (aHR) and 95% confidence intervals (95%CI) reported. To test for significance, non-inferiority margins were defined. RESULTS: In total, 279 women underwent a surgical procedure, of whom 263 (94%) had follow-up data. For the TLH (n = 175) and TAH (n = 88) groups, DFS (90.3% vs 84.1%; aHR[recurrence], 0.69; 95%CI, 0.31-1.52), OS (89.2% vs 82.8%; aHR[death], 0.60; 95%CI, 0.30-1.19), and DSS (95.0% vs 89.8%; aHR[death], 0.62; 95%CI, 0.23-1.70) were reported at 5 years. At a 10% significance level, and with a non-inferiority margin of 0.20, the null hypothesis of inferiority was rejected for all three outcomes. There were no port-site or wound metastases, and local recurrence rates were comparable. CONCLUSION: Disease recurrence and 5-year survival rates were comparable between the TLH and TAH groups and comparable to studies with lymphadenectomy, supporting the widespread use of TLH without lymphadenectomy as the primary treatment for early-stage, low-grade endometrial cancer.


Subject(s)
Carcinoma, Endometrioid/surgery; Endometrial Neoplasms/surgery; Hysterectomy/methods; Neoplasm Recurrence, Local/epidemiology; Adult; Aged; Aged, 80 and over; Carcinoma, Endometrioid/mortality; Carcinoma, Endometrioid/pathology; Disease-Free Survival; Endometrial Neoplasms/mortality; Endometrial Neoplasms/pathology; Female; Humans; Laparoscopy/methods; Laparotomy/methods; Lymph Node Excision; Middle Aged; Neoplasm Grading; Neoplasm Staging; Radiotherapy, Adjuvant
8.
Proc Natl Acad Sci U S A ; 116(45): 22664-22672, 2019 11 05.
Article in English | MEDLINE | ID: mdl-31636219

ABSTRACT

In order to produce effective antibodies, B cells undergo rapid somatic hypermutation (SHM) and selection for binding affinity to antigen via a process called affinity maturation. The similarities between this process and evolution by natural selection have led many groups to use phylogenetic methods to characterize the development of immunological memory, vaccination, and other processes that depend on affinity maturation. However, these applications are limited by the fact that most phylogenetic models are designed to be applied to individual lineages comprising genetically diverse sequences, while B cell repertoires often consist of hundreds to thousands of separate low-diversity lineages. Further, several features of affinity maturation violate important assumptions in standard phylogenetic models. Here, we introduce a hierarchical phylogenetic framework that integrates information from all lineages in a repertoire to more precisely estimate model parameters while simultaneously incorporating the unique features of SHM. We demonstrate the power of this repertoire-wide approach by characterizing previously undescribed phenomena in affinity maturation. First, we find evidence consistent with age-related changes in SHM hot-spot targeting. Second, we identify a consistent relationship between increased tree length and signs of increased negative selection, apparent in the repertoires of recently vaccinated subjects and those without any known recent infections or vaccinations. This suggests that B cell lineages shift toward negative selection over time as a general feature of affinity maturation. Our study provides a framework for undertaking repertoire-wide phylogenetic testing of SHM hypotheses and provides a means of characterizing dynamics of mutation and selection during affinity maturation.


Subject(s)
Aging/genetics; B-Lymphocytes/immunology; Evolution, Molecular; Phylogeny; Vaccination; Humans; Mutation
9.
BMC Genomics ; 21(1): 176, 2020 Feb 22.
Article in English | MEDLINE | ID: mdl-32087698

ABSTRACT

BACKGROUND: Vaccines have greatly reduced the burden of infectious disease, ranking in their impact on global health second only after clean water. Most vaccines confer protection by the production of antibodies with binding affinity for the antigen, which is the main effector function of B cells. This results in short term changes in the B cell receptor (BCR) repertoire when an immune response is launched, and long term changes when immunity is conferred. Analysis of antibodies in serum is usually used to evaluate vaccine response, however this is limited and therefore the investigation of the BCR repertoire provides far more detail for the analysis of vaccine response. RESULTS: Here, we introduce a novel Bayesian model to describe the observed distribution of BCR sequences and the pattern of sharing across time and between individuals, with the goal to identify vaccine-specific BCRs. We use data from two studies to assess the model and estimate that we can identify vaccine-specific BCRs with 69% sensitivity. CONCLUSION: Our results demonstrate that statistical modelling can capture patterns associated with vaccine response and identify vaccine specific B cells in a range of different data sets. Additionally, the B cells we identify as vaccine specific show greater levels of sequence similarity than expected, suggesting that there are additional signals of vaccine response, not currently considered, which could improve the identification of vaccine specific B cells.


Subject(s)
B-Lymphocytes/immunology; Models, Immunological; Vaccines; Bayes Theorem; Hepatitis B; Humans; Influenza, Human
10.
Bioinformatics ; 35(5): 798-806, 2019 03 01.
Article in English | MEDLINE | ID: mdl-30165547

ABSTRACT

MOTIVATION: The Li and Stephens model, which approximates the coalescent describing the pattern of variation in a population, underpins a range of key tools and results in genetics. Although highly efficient compared to the coalescent, standard implementations of this model still cannot deal with the very large reference cohorts that are starting to become available, and practical implementations use heuristics to achieve reasonable runtimes. RESULTS: Here I describe a new, exact algorithm ('fastLS') that implements the Li and Stephens model and achieves runtimes independent of the size of the reference cohort. Key to achieving this runtime is the use of the Burrows-Wheeler transform, allowing the algorithm to efficiently identify partial haplotype matches across a cohort. I show that the proposed data structure is very similar to, and generalizes, Durbin's positional Burrows-Wheeler transform.


Subject(s)
Algorithms; Haplotypes; Cohort Studies
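
For readers unfamiliar with the Burrows-Wheeler transform named in this abstract, the sketch below shows the classical string version computed naively from sorted rotations, together with its inverse. It is only an illustration of the basic transform: it is not the positional variant, nor the fastLS data structure, and the function names `bwt` and `inverse_bwt` and the `$` sentinel are assumptions made for the example.

```python
def bwt(text, sentinel="$"):
    """Naive Burrows-Wheeler transform via sorted rotations of text + sentinel."""
    if sentinel in text:
        raise ValueError("text must not contain the sentinel")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation table.
    return "".join(rot[-1] for rot in rotations)


def inverse_bwt(transformed, sentinel="$"):
    """Invert the transform by repeatedly prepending the column and re-sorting."""
    table = [""] * len(transformed)
    for _ in range(len(transformed)):
        table = sorted(ch + row for ch, row in zip(transformed, table))
    # The original text is the row that ends with the sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


encoded = bwt("banana")
decoded = inverse_bwt(encoded)
```

The transform groups characters that share a following context, which is the property that makes efficient matching possible; practical implementations use suffix arrays rather than explicit rotations.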
11.
Bioinformatics ; 35(13): 2177-2184, 2019 07 01.
Article in English | MEDLINE | ID: mdl-30481258

ABSTRACT

MOTIVATION: Convolutional neural networks (CNNs) have been tremendously successful in many contexts, particularly where training data are abundant and signal-to-noise ratios are large. However, when predicting noisily observed phenotypes from DNA sequence, each training instance is only weakly informative, and the amount of training data is often fundamentally limited, emphasizing the need for methods that make optimal use of training data and any structure inherent in the process. RESULTS: Here we show how to combine equivariant networks, a general mathematical framework for handling exact symmetries in CNNs, with Bayesian dropout, a version of Monte Carlo dropout suggested by a reinterpretation of dropout as a variational Bayesian approximation, to develop a model that exhibits exact reverse-complement symmetry and is more resistant to overtraining. We find that this model combines improved prediction consistency with better predictive accuracy compared to standard CNN implementations and state-of-art motif finders. We use our network to predict recombination hotspots from sequence, and identify binding motifs for the recombination-initiation protein PRDM9 previously unobserved in this data, which were recently validated by high-resolution assays. The network achieves a predictive accuracy comparable to that attainable by a direct assay of the H3K4me3 histone mark, a proxy for PRDM9 binding. AVAILABILITY AND IMPLEMENTATION: https://github.com/luntergroup/EquivariantNetworks. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Neural Networks, Computer; Bayes Theorem; Proteins; Recombination, Genetic
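
The reverse-complement symmetry mentioned in this abstract can be stated concretely: a model has the property if it gives the same output for a DNA sequence and for its reverse complement. The sketch below is a toy illustration of that property only, using a hand-written strand-symmetric motif count rather than an equivariant network; `reverse_complement`, `count_motif`, and `symmetric_count` are hypothetical names and do not come from the linked repository.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement every base, then reverse the strand."""
    return seq.translate(COMPLEMENT)[::-1]


def count_motif(seq, motif):
    """Count (possibly overlapping) occurrences of motif on the given strand."""
    return sum(seq.startswith(motif, i) for i in range(len(seq) - len(motif) + 1))


def symmetric_count(seq, motif):
    """Strand-symmetric feature: occurrences on the forward plus the reverse strand."""
    return count_motif(seq, motif) + count_motif(reverse_complement(seq), motif)


seq = "ACGTTGCAACGG"
forward = symmetric_count(seq, "AAC")
reverse = symmetric_count(reverse_complement(seq), "AAC")
```

Because taking the reverse complement twice returns the original sequence, `forward` and `reverse` sum the same two counts in a different order and therefore always agree.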
12.
Comput Stat ; 35(3): 1319-1344, 2020.
Article in English | MEDLINE | ID: mdl-32764847

ABSTRACT

Expectation maximization (EM) is a technique for estimating maximum-likelihood parameters of a latent variable model given observed data by alternating between taking expectations of sufficient statistics, and maximizing the expected log likelihood. For situations where sufficient statistics are intractable, stochastic approximation EM (SAEM) is often used, which uses Monte Carlo techniques to approximate the expected log likelihood. Two common implementations of SAEM, Batch EM (BEM) and online EM (OEM), are parameterized by a "learning rate", and their efficiency depends strongly on this parameter. We propose an extension to the OEM algorithm, termed Introspective Online Expectation Maximization (IOEM), which removes the need for specifying this parameter by adapting the learning rate to trends in the parameter updates. We show that our algorithm matches the efficiency of the optimal BEM and OEM algorithms in multiple models, and that the efficiency of IOEM can exceed that of BEM/OEM methods with optimal learning rates when the model has many parameters. Finally, we use IOEM to fit two models to a financial time series. A Python implementation is available at https://github.com/luntergroup/IOEM.git.
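
The alternation between an expectation step and a maximization step described above can be sketched for a simple case. The example below is plain EM on the full data set for a mixture of two unit-variance one-dimensional Gaussians; it is an assumption-laden illustration and not the SAEM, OEM, or IOEM algorithms of the paper, and it is unrelated to the linked implementation. The function name `em_two_gaussians`, the starting values, and the toy data are hypothetical.

```python
import math


def em_two_gaussians(data, mu=(-1.0, 1.0), weight=0.5, n_iter=50):
    """Plain EM for a mixture of two unit-variance 1-D Gaussians.

    weight is the mixing proportion of the second component.
    """
    mu0, mu1 = mu
    for _ in range(n_iter):
        # E-step: posterior probability that each point came from component 1.
        resp = []
        for x in data:
            p0 = (1.0 - weight) * math.exp(-0.5 * (x - mu0) ** 2)
            p1 = weight * math.exp(-0.5 * (x - mu1) ** 2)
            resp.append(p1 / (p0 + p1))
        # M-step: re-estimate the parameters from the expected sufficient statistics.
        n1 = sum(resp)
        n0 = len(data) - n1
        mu0 = sum((1.0 - r) * x for r, x in zip(resp, data)) / n0
        mu1 = sum(r * x for r, x in zip(resp, data)) / n1
        weight = n1 / len(data)
    return mu0, mu1, weight


data = [-5.2, -4.9, -5.1, -4.8, 4.9, 5.0, 5.2, 5.1, 4.8]
mu0, mu1, weight = em_two_gaussians(data)
```

In this toy model the expectations are available in closed form, so no Monte Carlo approximation or learning rate is needed; those only enter in the stochastic and online variants the abstract discusses.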

13.
BMC Genomics ; 19(1): 115, 2018 02 01.
Article in English | MEDLINE | ID: mdl-29390960

ABSTRACT

BACKGROUND: Transposable elements (TEs) are mobile genetic sequences that randomly propagate within their host's genome. This mobility has the potential to affect gene transcription and cause disease. However, TEs are technically challenging to identify, which complicates efforts to assess the impact of TE insertions on disease. Here we present a targeted sequencing protocol and computational pipeline to identify polymorphic and novel TE insertions using next-generation sequencing: TE-NGS. The method simultaneously targets the three subfamilies that are responsible for the majority of recent TE activity (L1HS, AluYa5/8, and AluYb8/9) thereby obviating the need for multiple experiments and reducing the amount of input material required. RESULTS: Here we describe the laboratory protocol and detection algorithm, and a benchmark experiment for the reference genome NA12878. We demonstrate a substantial enrichment for on-target fragments, and high sensitivity and precision to both reference and NA12878-specific insertions. We report 17 previously unreported loci for this individual which are supported by orthogonal long-read evidence, and we identify 1470 polymorphic and novel TEs in 12 additional samples that were previously undocumented in databases of insertion polymorphisms. CONCLUSIONS: We anticipate that future applications of TE-NGS alongside exome sequencing of patients with sporadic disease will reduce the number of unresolved cases, and improve estimates of the contribution of TEs to human genetic disease.


Subject(s)
Algorithms; DNA Transposable Elements; High-Throughput Nucleotide Sequencing/methods; Polymorphism, Single Nucleotide; Gene Library; Humans
14.
Nature ; 486(7404): 527-31, 2012 Jun 28.
Article in English | MEDLINE | ID: mdl-22722832

ABSTRACT

Two African apes are the closest living relatives of humans: the chimpanzee (Pan troglodytes) and the bonobo (Pan paniscus). Although they are similar in many respects, bonobos and chimpanzees differ strikingly in key social and sexual behaviours, and for some of these traits they show more similarity with humans than with each other. Here we report the sequencing and assembly of the bonobo genome to study its evolutionary relationship with the chimpanzee and human genomes. We find that more than three per cent of the human genome is more closely related to either the bonobo or the chimpanzee genome than these are to each other. These regions allow various aspects of the ancestry of the two ape species to be reconstructed. In addition, many of the regions that overlap genes may eventually help us understand the genetic basis of phenotypes that humans share with one of the two apes to the exclusion of the other.


Subject(s)
Evolution, Molecular; Genetic Variation/genetics; Genome, Human/genetics; Genome/genetics; Pan paniscus/genetics; Pan troglodytes/genetics; Animals; DNA Transposable Elements/genetics; Gene Duplication/genetics; Genotype; Humans; Molecular Sequence Data; Phenotype; Phylogeny; Species Specificity
15.
Nature ; 483(7388): 169-75, 2012 Mar 07.
Article in English | MEDLINE | ID: mdl-22398555

ABSTRACT

Gorillas are humans' closest living relatives after chimpanzees, and are of comparable importance for the study of human origins and evolution. Here we present the assembly and analysis of a genome sequence for the western lowland gorilla, and compare the whole genomes of all extant great ape genera. We propose a synthesis of genetic and fossil evidence consistent with placing the human-chimpanzee and human-chimpanzee-gorilla speciation events at approximately 6 and 10 million years ago. In 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other; this is rarer around coding genes, indicating pervasive selection throughout great ape evolution, and has functional consequences in gene expression. A comparison of protein coding genes reveals approximately 500 genes showing accelerated evolution on each of the gorilla, human and chimpanzee lineages, and evidence for parallel acceleration, particularly of genes involved in hearing. We also compare the western and eastern gorilla species, estimating an average sequence divergence time 1.75 million years ago, but with evidence for more recent genetic exchange and a population bottleneck in the eastern species. The use of the genome sequence in these and future analyses will promote a deeper understanding of great ape biology and evolution.


Subject(s)
Evolution, Molecular; Genetic Speciation; Genome/genetics; Gorilla gorilla/genetics; Animals; Female; Gene Expression Regulation; Genetic Variation/genetics; Genomics; Humans; Macaca mulatta/genetics; Molecular Sequence Data; Pan troglodytes/genetics; Phylogeny; Pongo/genetics; Proteins/genetics; Sequence Alignment; Species Specificity; Transcription, Genetic
16.
Mol Biol Evol ; 33(5): 1147-57, 2016 05.
Article in English | MEDLINE | ID: mdl-26802217

ABSTRACT

B-cell receptors (BCRs) are membrane-bound immunoglobulins that recognize and bind foreign proteins (antigens). BCRs are formed through random somatic changes of germline DNA, creating a vast repertoire of unique sequences that enable individuals to recognize a diverse range of antigens. After encountering antigen for the first time, BCRs undergo a process of affinity maturation, whereby cycles of rapid somatic mutation and selection lead to improved antigen binding. This constitutes an accelerated evolutionary process that takes place over days or weeks. Next-generation sequencing of the gene regions that determine BCR binding has begun to reveal the diversity and dynamics of BCR repertoires in unprecedented detail. Although this new type of sequence data has the potential to revolutionize our understanding of infection dynamics, quantitative analysis is complicated by the unique biology and high diversity of BCR sequences. Models and concepts from molecular evolution and phylogenetics that have been applied successfully to rapidly evolving pathogen populations are increasingly being adopted to study BCR diversity and divergence within individuals. However, BCR dynamics may violate key assumptions of many standard evolutionary methods, as they do not descend from a single ancestor, and experience biased mutation. Here, we review the application of evolutionary models to BCR repertoires and discuss the issues we believe need be addressed for this interdisciplinary field to flourish.


Subject(s)
Infections/genetics; Receptors, Antigen, B-Cell/genetics; Adaptive Immunity/genetics; Antibody Affinity; B-Lymphocytes/immunology; B-Lymphocytes/metabolism; Evolution, Molecular; Genetic Variation/genetics; High-Throughput Nucleotide Sequencing/methods; Humans; Immunoglobulins/genetics; Immunoglobulins/metabolism; Infections/immunology; Membrane Proteins/genetics; Membrane Proteins/metabolism; Mutation; Receptors, Antigen, B-Cell/immunology; Receptors, Antigen, B-Cell/metabolism
17.
J Immunol ; 194(1): 252-261, 2015 Jan 01.
Article in English | MEDLINE | ID: mdl-25392534

ABSTRACT

High-throughput sequencing allows detailed study of the BCR repertoire postimmunization, but it remains unclear to what extent the de novo identification of Ag-specific sequences from the total BCR repertoire is possible. A conjugate vaccine containing Haemophilus influenzae type b (Hib) and group C meningococcal polysaccharides, as well as tetanus toxoid (TT), was used to investigate the BCR repertoire of adult humans following immunization and to test the hypothesis that public or convergent repertoire analysis could identify Ag-specific sequences. A number of Ag-specific BCR sequences have been reported for Hib and TT, which made a vaccine containing these two Ags an ideal immunological stimulus. Analysis of identical CDR3 amino acid sequences that were shared by individuals in the postvaccine repertoire identified a number of known Hib-specific sequences but only one previously described TT sequence. The extension of this analysis to nonidentical, but highly similar, CDR3 amino acid sequences revealed a number of other TT-related sequences. The anti-Hib avidity index postvaccination strongly correlated with the relative frequency of Hib-specific sequences, indicating that the postvaccination public BCR repertoire may be related to more conventional measures of immunogenicity correlating with disease protection. Analysis of public BCR repertoire provided evidence of convergent BCR evolution in individuals exposed to the same Ags. If this finding is confirmed, the public repertoire could be used for rapid and direct identification of protective Ag-specific BCR sequences from peripheral blood.


Subject(s)
Immunoglobulin Heavy Chains/genetics; Immunoglobulin Heavy Chains/immunology; Receptors, Antigen, B-Cell/immunology; Vaccines, Combined/immunology; Vaccines, Conjugate/immunology; Adolescent; Adult; Aged; Amino Acid Sequence; Antibodies, Bacterial/immunology; B-Lymphocytes/immunology; Bacterial Capsules/immunology; Haemophilus Vaccines/immunology; Haemophilus influenzae type b/immunology; High-Throughput Nucleotide Sequencing; Humans; Immunoglobulin A/immunology; Immunoglobulin G/immunology; Immunoglobulin M/immunology; Meningococcal Vaccines/immunology; Middle Aged; Polysaccharides, Bacterial/immunology; Sequence Analysis, Protein; Tetanus Toxoid/immunology; Young Adult
18.
Nature ; 469(7331): 529-33, 2011 Jan 27.
Article in English | MEDLINE | ID: mdl-21270892

ABSTRACT

'Orang-utan' is derived from a Malay term meaning 'man of the forest' and aptly describes the southeast Asian great apes native to Sumatra and Borneo. The orang-utan species, Pongo abelii (Sumatran) and Pongo pygmaeus (Bornean), are the most phylogenetically distant great apes from humans, thereby providing an informative perspective on hominid evolution. Here we present a Sumatran orang-utan draft genome assembly and short read sequence data from five Sumatran and five Bornean orang-utan genomes. Our analyses reveal that, compared to other primates, the orang-utan genome has many unique features. Structural evolution of the orang-utan genome has proceeded much more slowly than other great apes, evidenced by fewer rearrangements, less segmental duplication, a lower rate of gene family turnover and surprisingly quiescent Alu repeats, which have played a major role in restructuring other primate genomes. We also describe a primate polymorphic neocentromere, found in both Pongo species, emphasizing the gradual evolution of orang-utan genome structure. Orang-utans have extremely low energy usage for a eutherian mammal, far lower than their hominid relatives. Adding their genome to the repertoire of sequenced primates illuminates new signals of positive selection in several pathways including glycolipid metabolism. From the population perspective, both Pongo species are deeply diverse; however, Sumatran individuals possess greater diversity than their Bornean counterparts, and more species-specific variation. Our estimate of Bornean/Sumatran speciation time, 400,000 years ago, is more recent than most previous studies and underscores the complexity of the orang-utan speciation process. Despite a smaller modern census population size, the Sumatran effective population size (N(e)) expanded exponentially relative to the ancestral N(e) after the split, while Bornean N(e) declined over the same period. 
Overall, the resources and analyses presented here offer new opportunities in evolutionary genomics, insights into hominid biology, and an extensive database of variation for conservation efforts.


Subject(s)
Genetic Variation; Genome/genetics; Pongo abelii/genetics; Pongo pygmaeus/genetics; Animals; Centromere/genetics; Cerebrosides/metabolism; Chromosomes; Evolution, Molecular; Female; Gene Rearrangement/genetics; Genetic Speciation; Genetics, Population; Humans; Male; Phylogeny; Population Density; Population Dynamics; Species Specificity
19.
PLoS Genet ; 10(7): e1004525, 2014 Jul.
Article in English | MEDLINE | ID: mdl-25057982

ABSTRACT

Ten years on from the finishing of the human reference genome sequence, it remains unclear what fraction of the human genome confers function, where this sequence resides, and how much is shared with other mammalian species. When addressing these questions, functional sequence has often been equated with pan-mammalian conserved sequence. However, functional elements that are short-lived, including those contributing to species-specific biology, will not leave a footprint of long-lasting negative selection. Here, we address these issues by identifying and characterising sequence that has been constrained with respect to insertions and deletions for pairs of eutherian genomes over a range of divergences. Within noncoding sequence, we find increasing amounts of mutually constrained sequence as species pairs become more closely related, indicating that noncoding constrained sequence turns over rapidly. We estimate that half of present-day noncoding constrained sequence has been gained or lost in approximately the last 130 million years (half-life in units of divergence time, d1/2 = 0.25-0.31). While enriched with ENCODE biochemical annotations, much of the short-lived constrained sequences we identify are not detected by models optimized for wider pan-mammalian conservation. Constrained DNase 1 hypersensitivity sites, promoters and untranslated regions have been more evolutionarily stable than long noncoding RNA loci which have turned over especially rapidly. By contrast, protein coding sequence has been highly stable, with an estimated half-life of over a billion years (d1/2 = 2.1-5.0). From extrapolations we estimate that 8.2% (7.1-9.2%) of the human genome is presently subject to negative selection and thus is likely to be functional, while only 2.2% has maintained constraint in both human and mouse since these species diverged. 
These results reveal that the evolutionary history of the human genome has been highly dynamic, particularly for its noncoding yet biologically functional fraction.


Subject(s)
Conserved Sequence/genetics; Evolution, Molecular; Genome, Human; Sequence Deletion/genetics; Animals; Base Sequence; Hominidae; Humans; Mice; Open Reading Frames; Sequence Alignment; Species Specificity
20.
Genome Res ; 23(5): 749-61, 2013 May.
Article in English | MEDLINE | ID: mdl-23478400

ABSTRACT

Short insertions and deletions (indels) are the second most abundant form of human genetic variation, but our understanding of their origins and functional effects lags behind that of other types of variants. Using population-scale sequencing, we have identified a high-quality set of 1.6 million indels from 179 individuals representing three diverse human populations. We show that rates of indel mutagenesis are highly heterogeneous, with 43%-48% of indels occurring in 4.03% of the genome, whereas in the remaining 96% their prevalence is 16 times lower than SNPs. Polymerase slippage can explain upwards of three-fourths of all indels, with the remainder being mostly simple deletions in complex sequence. However, insertions do occur and are significantly associated with pseudo-palindromic sequence features compatible with the fork stalling and template switching (FoSTeS) mechanism more commonly associated with large structural variations. We introduce a quantitative model of polymerase slippage, which enables us to identify indel-hypermutagenic protein-coding genes, some of which are associated with recurrent mutations leading to disease. Accounting for mutational rate heterogeneity due to sequence context, we find that indels across functional sequence are generally subject to stronger purifying selection than SNPs. We find that indel length modulates selection strength, and that indels affecting multiple functionally constrained nucleotides undergo stronger purifying selection. We further find that indels are enriched in associations with gene expression and find evidence for a contribution of nonsense-mediated decay. Finally, we show that indels can be integrated in existing genome-wide association studies (GWAS); although we do not find direct evidence that potentially causal protein-coding indels are enriched with associations to known disease-associated SNPs, our findings suggest that the causal variant underlying some of these associations may be indels.


Subject(s)
Evolution, Molecular; Genome, Human; INDEL Mutation/genetics; Genetics, Population; Genome-Wide Association Study; High-Throughput Nucleotide Sequencing; Humans; Mutagenesis, Insertional; Mutation Rate; Polymorphism, Single Nucleotide