Results 1 - 20 of 2,535
1.
Cell ; 176(3): 663-675.e19, 2019 Jan 24.
Article in English | MEDLINE | ID: mdl-30661756

ABSTRACT

In order to provide a comprehensive resource for human structural variants (SVs), we generated long-read sequence data and analyzed SVs for fifteen human genomes. We sequence resolved 99,604 insertions, deletions, and inversions including 2,238 (1.6 Mbp) that are shared among all discovery genomes with an additional 13,053 (6.9 Mbp) present in the majority, indicating minor alleles or errors in the reference. Genotyping in 440 additional genomes confirms the most common SVs in unique euchromatin are now sequence resolved. We report a ninefold SV bias toward the last 5 Mbp of human chromosomes with nearly 55% of all VNTRs (variable number of tandem repeats) mapping to this portion of the genome. We identify SVs affecting coding and noncoding regulatory loci improving annotation and interpretation of functional variation. These data provide the framework to construct a canonical human reference and a resource for developing advanced representations capable of capturing allelic diversity.


Subject(s)
Gene Frequency/genetics , Genome, Human/genetics , Genomic Structural Variation/genetics , Alleles , Euchromatin/genetics , Genomics/methods , Humans , Minisatellite Repeats/genetics , Sequence Analysis, DNA/methods
2.
Annu Rev Genomics Hum Genet ; 25(1): 77-104, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38663087

ABSTRACT

The Human Genome Project was an enormous accomplishment, providing a foundation for countless explorations into the genetics and genomics of the human species. Yet for many years, the human genome reference sequence remained incomplete and lacked representation of human genetic diversity. Recently, two major advances have emerged to address these shortcomings: complete gap-free human genome sequences, such as the one developed by the Telomere-to-Telomere Consortium, and high-quality pangenomes, such as the one developed by the Human Pangenome Reference Consortium. Facilitated by advances in long-read DNA sequencing and genome assembly algorithms, complete human genome sequences resolve regions that have been historically difficult to sequence, including centromeres, telomeres, and segmental duplications. In parallel, pangenomes capture the extensive genetic diversity across populations worldwide. Together, these advances usher in a new era of genomics research, enhancing the accuracy of genomic analysis, paving the path for precision medicine, and contributing to deeper insights into human biology.


Subject(s)
Genome, Human , Human Genome Project , Humans , Genetic Variation , Genomics/methods , Sequence Analysis, DNA/methods , Telomere/genetics
3.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38344864

ABSTRACT

Bacteriophages can aid the treatment of bacterial infections, yet in-silico models are required to deal with the great genetic diversity among phages and bacteria. Despite tolerable prediction performance, current approaches are limited to prediction at the species level and cannot accurately predict phage relationships across strain mutants. This has hindered the development of phage therapeutics based on the prediction of phage-bacteria relationships. In this paper, we present PB-LKS, which predicts phage-bacteria interactions based on a local K-mer strategy with higher performance and wider applicability. The utility of PB-LKS is rigorously validated through (i) large-scale historical screening, (ii) a case study at the class level and (iii) in vitro simulation of bacterial antiphage resistance at the strain mutant level. The PB-LKS approach could outperform the current state-of-the-art methods and illustrates potential clinical utility in pre-optimized phage therapy design.


Subject(s)
Bacterial Infections , Bacteriophages , Humans , Bacteriophages/genetics , Bacteria/genetics
4.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38581418

ABSTRACT

Following the milestone success of the Human Genome Project, the 'Encyclopedia of DNA Elements (ENCODE)' initiative was launched in 2003 to unearth information about the numerous functional elements within the genome. This endeavor coincided with the emergence of numerous novel technologies, accompanied by the provision of vast amounts of whole-genome sequences and high-throughput data such as ChIP-Seq and RNA-Seq. Extracting biologically meaningful information from this massive dataset has become a critical aspect of many recent studies, particularly in annotating and predicting the functions of unknown genes. The core idea behind genome annotation is to identify genes and various functional elements within the genome sequence and infer their biological functions. Traditional wet-lab experimental methods still rely on extensive efforts for functional verification. However, early bioinformatics algorithms and software primarily employed shallow learning techniques; thus, their ability to characterize data and learn features was limited. With the widespread adoption of RNA-Seq technology, scientists from the biological community began to harness the potential of machine learning and deep learning approaches for gene structure prediction and functional annotation. In this context, we reviewed both conventional methods and contemporary deep learning frameworks, and highlighted novel perspectives on the challenges arising during annotation, underscoring the dynamic nature of this evolving scientific landscape.


Subject(s)
Deep Learning , Humans , Genome , Algorithms , Software , Computational Biology/methods , Molecular Sequence Annotation
5.
Plant J ; 2024 Jul 29.
Article in English | MEDLINE | ID: mdl-39073886

ABSTRACT

Genetic screens are powerful tools for biological research and are one of the reasons for the success of the thale cress Arabidopsis thaliana as a research model. Here, we describe the whole-genome sequencing of 871 Arabidopsis lines from the Homozygous EMS Mutant (HEM) collection as a novel resource for forward and reverse genetics. With an average of 576 high-confidence mutations per HEM line, over three independent mutations altering protein sequences are found on average per gene in the collection. Pilot reverse genetics experiments on reproductive, developmental, immune and physiological traits confirmed the efficacy of the tool for identifying null, knockdown and gain-of-function alleles. The possibility of conducting subtle repeated phenotyping and the immediate availability of the mutations will empower forward genetic approaches. The sequence resource is searchable with the ATHEM web interface (https://lipm-browsers.toulouse.inra.fr/pub/ATHEM/), and the biological material is distributed by the Versailles Arabidopsis Stock Center.

6.
Brief Bioinform ; 24(5)2023 Sep 20.
Article in English | MEDLINE | ID: mdl-37466138

ABSTRACT

Accurately identifying phage-host relationships from their genome sequences is still challenging, especially for phages and hosts with few homologous sequences. In this work, focusing on identifying phage-host relationships at the species and genus level, we propose a contrastive learning based approach to learn whole-genome sequence embeddings that take account of phage-host interactions (PHIs). Contrastive learning is used to make phages infecting the same hosts close to each other in the new representation space. Specifically, we rephrase whole-genome sequences with frequency chaos game representation (FCGR) and learn latent embeddings that 'encapsulate' phage-host relationships through contrastive learning. The contrastive learning method works well on the imbalanced dataset. Based on the learned embeddings, a proposed pipeline named CL4PHI can predict both known hosts and hosts unseen in training. We compare our method with two recently proposed state-of-the-art learning-based methods on their benchmark datasets. The experimental results demonstrate that the proposed method using contrastive learning improves the prediction accuracy on known hosts and demonstrates a zero-shot prediction capability on unseen hosts. In terms of potential applications, the rapid pace of genome sequencing across different species has resulted in a vast amount of whole-genome sequencing data that requires efficient computational methods for identifying phage-host interactions. The proposed approach is expected to address this need by efficiently processing whole-genome sequences of phages and prokaryotic hosts and capturing features related to phage-host relationships for genome sequence representation. This approach can be used to accelerate the discovery of phage-host interactions and aid in the development of phage-based therapies for infectious diseases.
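
The frequency chaos game representation mentioned in this abstract can be illustrated with a minimal sketch. It is not the CL4PHI implementation: the function name `fcgr`, the corner assignment for A, C, G and T, and the choice to skip k-mers containing ambiguous bases are all assumptions made here for illustration. Each k-mer is mapped to one cell of a 2^k x 2^k grid, and the grid holds relative k-mer frequencies.

```python
# Corner assignment is one common convention, assumed here for illustration.
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def fcgr(sequence, k):
    """Return a 2^k x 2^k grid (rows = y, columns = x) of k-mer frequencies."""
    size = 2 ** k
    grid = [[0.0] * size for _ in range(size)]
    seq = sequence.upper()
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if any(base not in CORNERS for base in kmer):
            continue  # skip k-mers with ambiguous bases such as N
        x = y = 0
        # As in the chaos game, the last base selects the coarsest quadrant,
        # so bases are read from the end of the k-mer to its start.
        for base in reversed(kmer):
            bx, by = CORNERS[base]
            x = (x << 1) | bx
            y = (y << 1) | by
        grid[y][x] += 1
        total += 1
    if total:
        grid = [[count / total for count in row] for row in grid]
    return grid
```

A grid of this kind can be treated as a single-channel image and passed to an encoder; the choice of k and of the encoder is left open here.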


Subject(s)
Bacteriophages , Bacteriophages/genetics , Genome, Viral , Whole Genome Sequencing , Chromosome Mapping
7.
Drug Resist Updat ; 77: 101124, 2024 Aug 02.
Article in English | MEDLINE | ID: mdl-39128195

ABSTRACT

BACKGROUND: Klebsiella pneumoniae (Kp) is a common community-acquired and nosocomial pathogen. Carbapenem-resistant and hypervirulent (CR-hvKp) variants can emerge rapidly within healthcare facilities and may be affected by other infectious agents such as the COVID-19 virus. METHODS: To understand the impact of the COVID-19 virus on the prevalence of CR-hvKp, we accessed Kp genomes with corresponding metadata from GenBank. Sequence types (STs), antimicrobial resistance genes, virulence genes, their scores, and CR-hvKp were identified. We analyzed the population diversity and phylogenetic characteristics of the five most common STs, measured the prevalence of CR-hvKp, identified CR-hvKp subtypes, and determined associations of carbapenem resistance gene subtypes with STs and plasmid types. These variables were compared before and during the COVID-19 pandemic. FINDINGS: The proportion of CR-hvKp isolates increased within multiple STs on different continents during the COVID-19 pandemic, and persistent CR-hvKp subtypes were found in common STs. blaKPC was dominant in CG258, blaKPC-2 was detected in 97% of the ST11 CR-hvKp, blaNDM subtypes were prominent in ST147 (87.4%) and ST307 (70.8%), and blaOXA-48 and its subtypes were prevalent in ST15 (80.5%). The possession of carbapenemase genes differed among subclades from different origins and time periods within each ST. IncFIB/IncHI1B hybrid plasmids contained virulence genes and carbapenemase genes and were predominant in ST147 (67.37%) and ST307 (56.25%). INTERPRETATION: The prevalence of CR-hvKp increased during the COVID-19 pandemic, as evidenced by an increase in local endemic clones. This process was facilitated by the convergence of plasmids containing carbapenemase genes and virulence genes. These findings have implications for the appropriate use of antimicrobials and for infection prevention and control during outbreaks of respiratory viruses and pandemic management.

8.
Proc Natl Acad Sci U S A ; 119(45): e2207022119, 2022 Nov 08.
Article in English | MEDLINE | ID: mdl-36322726

ABSTRACT

Spatially targeted interventions may be effective alternatives to individual or population-based prevention strategies against tuberculosis (TB). However, their efficacy may depend on the mechanisms that lead to geographically constrained hotspots. Local TB incidence may reflect high levels of local transmission; conversely, it may point to frequent travel of community members to high-risk areas. We used whole-genome sequencing to explore patterns of TB incidence and transmission in Lima, Peru. Between 2009 and 2012, we recruited incident pulmonary TB patients and their household contacts, whom we followed for the occurrence of TB disease. We used whole-genome sequences of 2,712 Mycobacterium tuberculosis isolates from 2,440 patients to estimate pairwise genomic distances and compared these to the spatial distance between patients' residences. Genomic distances increased rapidly as spatial distances increased and remained high beyond 2 km of separation. Next, we divided the study catchment area into 1 × 1 km grid-cell surface units and used household spatial coordinates to locate each TB patient to a specific cell. We estimated cell-specific transmission by calculating the proportion of patients in each cell with a pairwise genomic distance of 10 or fewer single-nucleotide polymorphisms. We found that cell-specific TB incidence and local transmission varied widely but that cell-specific TB incidence did not correlate closely with our estimates of local transmission (Cohen's κ = 0.27). These findings indicate that an understanding of the spatial heterogeneity in the relative proportion of TB due to local transmission may help guide the implementation of spatially targeted interventions.
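
The grid-cell calculation described in this abstract can be sketched as follows. This is an illustrative reading, not the authors' pipeline: the function names are hypothetical, genomic distance is simplified to a Hamming distance between equal-length aligned sequences, and a patient is assumed to count as "linked" when at least one other patient anywhere in the data set lies within the SNP threshold. The abstract does not state these details.

```python
import math
from collections import defaultdict


def cell_of(x_km, y_km, cell_km=1.0):
    """Map coordinates in km to the index of a square grid cell."""
    return (math.floor(x_km / cell_km), math.floor(y_km / cell_km))


def snp_distance(a, b):
    """Hamming distance between two equal-length aligned sequences (simplified)."""
    return sum(1 for p, q in zip(a, b) if p != q)


def cell_transmission(patients, threshold=10):
    """patients: list of (x_km, y_km, aligned_sequence) tuples.

    Returns {cell: proportion of patients in that cell that have at least
    one other patient within `threshold` SNPs} (assumed definition).
    """
    n = len(patients)
    linked = [False] * n
    for i in range(n):
        for j in range(i + 1, n):
            if snp_distance(patients[i][2], patients[j][2]) <= threshold:
                linked[i] = linked[j] = True
    totals = defaultdict(int)
    hits = defaultdict(int)
    for (x, y, _), is_linked in zip(patients, linked):
        cell = cell_of(x, y)
        totals[cell] += 1
        hits[cell] += int(is_linked)
    return {cell: hits[cell] / totals[cell] for cell in totals}
```

The all-pairs loop is quadratic in the number of patients, which is acceptable for a few thousand isolates; a real analysis would also need to handle missing or ambiguous sites when computing distances.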


Subject(s)
Mycobacterium tuberculosis , Tuberculosis, Pulmonary , Tuberculosis , Humans , Peru/epidemiology , Tuberculosis/epidemiology , Mycobacterium tuberculosis/genetics , Tuberculosis, Pulmonary/epidemiology , Whole Genome Sequencing
9.
BMC Genomics ; 25(1): 91, 2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38253995

ABSTRACT

BACKGROUND: Spodoptera litura is a harmful pest that feeds on more than 80 species of plants and can be infected and killed by Spodoptera litura nucleopolyhedrovirus (SpltNPV). SpltNPV-C3 is a type C SpltNPV clone that was observed and collected in Japan. Compared with type A or type B SpltNPVs, SpltNPV-C3 can cause rapid mortality of S. litura larvae. METHODS: In this study, occlusion bodies (OBs) and occlusion-derived viruses (ODVs) of SpltNPV-C3 were purified, and OBs were observed by scanning electron microscopy (SEM). ODVs were observed under a transmission electron microscope (TEM). RESULTS: Both OBs and ODVs exhibit morphological characteristics typical of nucleopolyhedroviruses (NPVs). The genome of SpltNPV-C3 was sequenced and analyzed; the total length was 148,634 bp (GenBank accession 780,426, which was submitted as SpltNPV-II), with a G + C content of 45%. A total of 149 predicted ORFs were found. A phylogenetic tree of 90 baculoviruses was constructed based on core baculovirus genes. LC-MS/MS was used to analyze the proteins of SpltNPV-C3; 34 proteins were found in the purified ODVs, 15 of which were core proteins. The structure of the complexes formed by per os infectivity factors 1, 2, 3 and 4 (PIF-1, PIF-2, PIF-3 and PIF-4) was predicted with the help of the AlphaFold multimer tool, and conserved sequences were predicted in PIF-3. SpltNPV-C3 is a valuable species because of its virulence, and the analysis of its genome and proteins in this research will be beneficial for pest control efforts.


Subject(s)
Nucleopolyhedroviruses , Proteome , Animals , Nucleopolyhedroviruses/genetics , Spodoptera , Chromatography, Liquid , Phylogeny , Tandem Mass Spectrometry , Baculoviridae
10.
BMC Genomics ; 25(1): 136, 2024 Feb 02.
Article in English | MEDLINE | ID: mdl-38308218

ABSTRACT

Microbial remediation of heavy-metal-polluted environments is ecofriendly and cost effective. Shewanella putrefaciens strain 4H was previously isolated by our group from the activated sludge of a secondary sedimentation tank in a dyeing wastewater treatment plant. The bacterium was able to reduce chromate effectively. The strain showed significant ability to reduce Cr(VI) in the pH range of 8.0 to 10.0 (optimum pH 9.0) and at 25-42 ℃ (optimum 30 ℃), and was able to reduce 300 mg/L of Cr(VI) in 72 h under facultative anaerobic conditions. In this paper, the complete genome sequence was obtained by Nanopore sequencing technology, and chromium metabolism-related genes were analyzed by comparative genomics. The genomic sequence of S. putrefaciens 4H has a length of 4,631,110 bp with a G + C content of 44.66% and contains 4015 protein-coding genes; 3223, 2414, and 2343 genes were annotated in the COG, KEGG, and GO databases, respectively. The qRT-PCR analysis showed that the expression of the chrA, mtrC, and undA genes was up-regulated under Cr(VI) stress. This study explores the chromium metabolism-related genes of S. putrefaciens 4H and will help to deepen our understanding of the mechanisms of Cr(VI) tolerance and reduction in this strain, thus contributing to the better application of S. putrefaciens 4H in the remediation of chromium-contaminated environments.
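
The G + C content reported for this genome is the percentage of bases that are G or C. A minimal sketch of the calculation follows; the function name `gc_content` is hypothetical, and ignoring symbols other than A, C, G and T in the denominator is an assumption, since conventions for ambiguous bases differ between tools.

```python
def gc_content(sequence):
    """Return the G + C percentage of a nucleotide sequence.

    Symbols other than A, C, G and T (for example N) are ignored,
    which is an assumed convention.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / (gc + at)
```

For a multi-contig assembly, the counts would be summed across all contigs before dividing, rather than averaging per-contig percentages.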


Subject(s)
Shewanella putrefaciens , Shewanella putrefaciens/genetics , Shewanella putrefaciens/metabolism , Oxidation-Reduction , Chromium/toxicity , Chromium/metabolism , Bacteria/metabolism