Search | BVS - Ministry of Health

1.

The benefit of a complete reference genome for cancer structural variant analysis.

Paulin, Luis F; Fan, Jeremy; O'Neill, Kieran; Pleasance, Erin; Porter, Vanessa L; Jones, Steven J M; Sedlazeck, Fritz J.

medRxiv ; 2024 Mar 18.

Article in English | MEDLINE | ID: mdl-38562786

ABSTRACT

The complexities of cancer genomes are becoming more easily interpreted due to advancements in sequencing technologies and improved bioinformatic analysis. Structural variants (SVs) represent an important subset of somatic events in tumors. While detection of SVs has been markedly improved by the development of long-read sequencing, somatic variant identification and annotation remain challenging. We hypothesized that use of a completed human reference genome (CHM13-T2T) would improve somatic SV calling. Our findings in a tumour/normal matched benchmark sample and two patient samples show that the CHM13-T2T improves SV detection and prioritization accuracy compared to GRCh38, with a notable reduction in false positive calls. We also overcame the lack of annotation resources for CHM13-T2T by lifting over CHM13-T2T-aligned reads to the GRCh38 genome, therefore combining both improved alignment and advanced annotations. In this process, we assessed the current SV benchmark set for COLO829/COLO829BL across four replicates sequenced at different centers with different long-read technologies. We discovered instability of this cell line across these replicates; 346 SVs (1.13%) were only discoverable in a single replicate. We identify 49 somatic SVs, which appear to be stable as they are consistently present across the four replicates. As such, we propose this consensus set as an updated benchmark for somatic SV calling and include both GRCh38 and CHM13-T2T coordinates in our benchmark. The benchmark is available at: 10.5281/zenodo.10819636. Our work demonstrates new approaches to optimize somatic SV prioritization in cancer with potential improvements in other genetic diseases.

2.

Detection of mosaic and population-level structural variants with Sniffles2.

Smolka, Moritz; Paulin, Luis F; Grochowski, Christopher M; Horner, Dominic W; Mahmoud, Medhat; Behera, Sairam; Kalef-Ezra, Ester; Gandhi, Mira; Hong, Karl; Pehlivan, Davut; Scholz, Sonja W; Carvalho, Claudia M B; Proukakis, Christos; Sedlazeck, Fritz J.

Nat Biotechnol ; 2024 Jan 02.

Article in English | MEDLINE | ID: mdl-38168980

ABSTRACT

Calling structural variations (SVs) is technically challenging, but using long reads remains the most accurate way to identify complex genomic alterations. Here we present Sniffles2, which improves over current methods by implementing a repeat-aware clustering coupled with a fast consensus sequence and coverage-adaptive filtering. Sniffles2 is 11.8 times faster and 29% more accurate than state-of-the-art SV callers across different coverages (5-50×), sequencing technologies (ONT and HiFi) and SV types. Furthermore, Sniffles2 solves the problem of family-level to population-level SV calling to produce fully genotyped VCF files. Across 11 probands, we accurately identified causative SVs around MECP2, including highly complex alleles with three overlapping SVs. Sniffles2 also enables the detection of mosaic SVs in bulk long-read data. As a result, we identified multiple mosaic SVs in brain tissue from a patient with multiple system atrophy. The identified SVs showed a remarkable diversity within the cingulate cortex, impacting both genes involved in neuron function and repetitive elements.

3.

Publisher Correction: Detection of mosaic and population-level structural variants with Sniffles2.

Smolka, Moritz; Paulin, Luis F; Grochowski, Christopher M; Horner, Dominic W; Mahmoud, Medhat; Behera, Sairam; Kalef-Ezra, Ester; Gandhi, Mira; Hong, Karl; Pehlivan, Davut; Scholz, Sonja W; Carvalho, Claudia M B; Proukakis, Christos; Sedlazeck, Fritz J.

Nat Biotechnol ; 2024 Jan 22.

Article in English | MEDLINE | ID: mdl-38253882

4.

Improved sequence mapping using a complete reference genome and lift-over.

Chen, Nae-Chyun; Paulin, Luis F; Sedlazeck, Fritz J; Koren, Sergey; Phillippy, Adam M; Langmead, Ben.

Nat Methods ; 21(1): 41-49, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38036856

ABSTRACT

Complete, telomere-to-telomere (T2T) genome assemblies promise improved analyses and the discovery of new variants, but many essential genomic resources remain associated with older reference genomes. Thus, there is a need to translate genomic features and read alignments between references. Here we describe a method called levioSAM2 that performs fast and accurate lift-over between assemblies using a whole-genome map. In addition to enabling the use of several references, we demonstrate that aligning reads to a high-quality reference (for example, T2T-CHM13) and lifting to an older reference (for example, Genome Reference Consortium (GRC)h38) improves the accuracy of the resulting variant calls on the old reference. By leveraging the quality improvements of T2T-CHM13, levioSAM2 reduces small and structural variant calling errors compared with GRC-based mapping using real short- and long-read datasets. Performance is especially improved for a set of complex medically relevant genes, where the GRC references are lower quality.

Subjects

Genome; Genomics; Sequence Analysis, DNA/methods; Genomics/methods; Chromosome Mapping; High-Throughput Nucleotide Sequencing

5.

Break-induced replication underlies formation of inverted triplications and generates unexpected diversity in haplotype structures.

Grochowski, Christopher M; Bengtsson, Jesse D; Du, Haowei; Gandhi, Mira; Lun, Ming Yin; Mehaffey, Michele G; Park, KyungHee; Höps, Wolfram; Benito-Garagorri, Eva; Hasenfeld, Patrick; Korbel, Jan O; Mahmoud, Medhat; Paulin, Luis F; Jhangiani, Shalini N; Muzny, Donna M; Fatih, Jawid M; Gibbs, Richard A; Pendleton, Matthew; Harrington, Eoghan; Juul, Sissel; Lindstrand, Anna; Sedlazeck, Fritz J; Pehlivan, Davut; Lupski, James R; Carvalho, Claudia M B.

bioRxiv ; 2023 Oct 03.

Article in English | MEDLINE | ID: mdl-37873367

ABSTRACT

Background: The duplication-triplication/inverted-duplication (DUP-TRP/INV-DUP) structure is a type of complex genomic rearrangement (CGR) hypothesized to result from replicative repair of DNA due to replication fork collapse. It is often mediated by a pair of inverted low-copy repeats (LCR) followed by iterative template switches resulting in at least two breakpoint junctions in cis. Although it has been identified as an important mutation signature of pathogenicity for genomic disorders and cancer genomes, its architecture remains unresolved and is predicted to display at least four structural variation (SV) haplotypes. Results: Here we studied the genomic architecture of DUP-TRP/INV-DUP by investigating the genomic DNA of 24 patients with neurodevelopmental disorders identified by array comparative genomic hybridization (aCGH) on whom we found evidence for the existence of 4 out of 4 predicted SV haplotypes. Using a combination of short-read genome sequencing (GS), long-read GS, optical genome mapping and StrandSeq the haplotype structure was resolved in 18 samples. This approach refined the point of template switching between inverted LCRs in 4 samples revealing a DNA segment of ~2.2-5.5 kb of 100% nucleotide similarity. A prediction model was developed to infer the LCR used to mediate the non-allelic homology repair. Conclusions: These data provide experimental evidence supporting the hypothesis that inverted LCRs act as a recombinant substrate in replication-based repair mechanisms. Such inverted repeats are particularly relevant for formation of copy-number associated inversions, including the DUP-TRP/INV-DUP structures. Moreover, this type of CGR can result in multiple conformers which contributes to generate diverse SV haplotypes in susceptible loci.

6.

FixItFelix: improving genomic analysis by fixing reference errors.

Behera, Sairam; LeFaive, Jonathon; Orchard, Peter; Mahmoud, Medhat; Paulin, Luis F; Farek, Jesse; Soto, Daniela C; Parker, Stephen C J; Smith, Albert V; Dennis, Megan Y; Zook, Justin M; Sedlazeck, Fritz J.

Genome Biol ; 24(1): 31, 2023 Feb 21.

Article in English | MEDLINE | ID: mdl-36810122

ABSTRACT

The current version of the human reference genome, GRCh38, contains a number of errors including 1.2 Mbp of falsely duplicated and 8.04 Mbp of collapsed regions. These errors impact the variant calling of 33 protein-coding genes, including 12 with medical relevance. Here, we present FixItFelix, an efficient remapping approach, together with a modified version of the GRCh38 reference genome that improves the subsequent analysis across these genes within minutes for an existing alignment file while maintaining the same coordinates. We showcase these improvements over multi-ethnic control samples, demonstrating improvements for population variant calling as well as eQTL studies.

Subjects

Genome, Human; Genomics; Humans; High-Throughput Nucleotide Sequencing; Sequence Analysis, DNA

7.

SVhound: detection of regions that harbor yet undetected structural variation.

Paulin, Luis F; Raveendran, Muthuswamy; Harris, R Alan; Rogers, Jeffrey; von Haeseler, Arndt; Sedlazeck, Fritz J.

BMC Bioinformatics ; 24(1): 23, 2023 Jan 20.

Article in English | MEDLINE | ID: mdl-36670361

ABSTRACT

BACKGROUND: Recent population studies are ever growing in number of samples to investigate the diversity of a population or species. These studies reveal new polymorphisms that lead to important insights into the mechanisms of evolution, but are also important for the interpretation of these variations. Nevertheless, while the full catalog of variations across entire species remains unknown, we can predict which regions harbor additional not yet detected variations and investigate their properties, thereby enhancing the analysis for potentially missed variants. RESULTS: To achieve this we developed SVhound (https://github.com/lfpaulin/SVhound), which, based on a population-level SV dataset, can predict regions that harbor unseen SV alleles. We tested SVhound using subsets of the 1000 genomes project data and showed that its correlation (average correlation of 2800 tests r = 0.7136) is high to the full data set. Next, we utilized SVhound to investigate potentially missed or understudied regions across 1KGP and CCDG. Lastly, we also apply SVhound on a small and novel SV call set for rhesus macaque (Macaca mulatta) and discuss the impact and choice of parameters for SVhound. CONCLUSIONS: SVhound is a unique method to identify potential regions that harbor hidden diversity in model and non-model organisms and can also be potentially used to ensure high quality of SV call sets.

Subjects

Genomic Structural Variation; Polymorphism, Genetic; Software; Animals; Humans; Alleles; Genome, Human; High-Throughput Nucleotide Sequencing/methods; Macaca mulatta/genetics

8.

The third international hackathon for applying insights into large-scale genomic composition to use cases in a wide range of organisms.

Walker, Kimberly; Kalra, Divya; Lowdon, Rebecca; Chen, Guangyi; Molik, David; Soto, Daniela C; Dabbaghie, Fawaz; Khleifat, Ahmad Al; Mahmoud, Medhat; Paulin, Luis F; Raza, Muhammad Sohail; Pfeifer, Susanne P; Agustinho, Daniel Paiva; Aliyev, Elbay; Avdeyev, Pavel; Barrozo, Enrico R; Behera, Sairam; Billingsley, Kimberley; Chong, Li Chuin; Choubey, Deepak; De Coster, Wouter; Fu, Yilei; Gener, Alejandro R; Hefferon, Timothy; Henke, David Morgan; Höps, Wolfram; Illarionova, Anastasia; Jochum, Michael D; Jose, Maria; Kesharwani, Rupesh K; Kolora, Sree Rohit Raj; Kubica, Jedrzej; Lakra, Priya; Lattimer, Damaris; Liew, Chia-Sin; Lo, Bai-Wei; Lo, Chunhsuan; Lötter, Anneri; Majidian, Sina; Mendem, Suresh Kumar; Mondal, Rajarshi; Ohmiya, Hiroko; Parvin, Nasrin; Peralta, Carolina; Poon, Chi-Lam; Prabhakaran, Ramanandan; Saitou, Marie; Sammi, Aditi; Sanio, Philippe; Sapoval, Nicolae.

F1000Res ; 11: 530, 2022.

Article in English | MEDLINE | ID: mdl-36262335

ABSTRACT

In October 2021, 59 scientists from 14 countries and 13 U.S. states collaborated virtually in the Third Annual Baylor College of Medicine & DNANexus Structural Variation hackathon. The goal of the hackathon was to advance research on structural variants (SVs) by prototyping and iterating on open-source software. This led to nine hackathon projects focused on diverse genomics research interests, including various SV discovery and genotyping methods, SV sequence reconstruction, and clinically relevant structural variation, including SARS-CoV-2 variants. Repositories for the projects that participated in the hackathon are available at https://github.com/collaborativebioinformatics.

Subjects

COVID-19; SARS-CoV-2; Humans; SARS-CoV-2/genetics; Genomics; Software

9.

ATM controls meiotic DNA double-strand break formation and recombination and affects synaptonemal complex organization in plants.

Kurzbauer, Marie-Therese; Janisiw, Michael Peter; Paulin, Luis F; Prusén Mota, Ignacio; Tomanov, Konstantin; Krsicka, Ondrej; Haeseler, Arndt von; Schubert, Veit; Schlögelhofer, Peter.

Plant Cell ; 33(5): 1633-1656, 2021 Jul 02.

Article in English | MEDLINE | ID: mdl-33659989

ABSTRACT

Meiosis is a specialized cell division that gives rise to genetically distinct gametic cells. Meiosis relies on the tightly controlled formation of DNA double-strand breaks (DSBs) and their repair via homologous recombination for correct chromosome segregation. Like all forms of DNA damage, meiotic DSBs are potentially harmful and their formation activates an elaborate response to inhibit excessive DNA break formation and ensure successful repair. Previous studies established the protein kinase ATM as a DSB sensor and meiotic regulator in several organisms. Here we show that Arabidopsis ATM acts at multiple steps during DSB formation and processing, as well as crossover (CO) formation and synaptonemal complex (SC) organization, all vital for the successful completion of meiosis. We developed a single-molecule approach to quantify meiotic breaks and determined that ATM is essential to limit the number of meiotic DSBs. Local and genome-wide recombination screens showed that ATM restricts the number of interference-insensitive COs, while super-resolution STED nanoscopy of meiotic chromosomes revealed that the kinase affects chromatin loop size and SC length and width. Our study extends our understanding of how ATM functions during plant meiosis and establishes it as an integral factor of the meiotic program.

Subjects

Arabidopsis/metabolism; DNA Breaks, Double-Stranded; Meiosis; Recombination, Genetic/genetics; Synaptonemal Complex/metabolism; Arabidopsis Proteins/genetics; Arabidopsis Proteins/metabolism; Ataxia Telangiectasia Mutated Proteins/genetics; Ataxia Telangiectasia Mutated Proteins/metabolism; Chromatin/metabolism; Crossing Over, Genetic; DNA Repair; Fertility; Mutation/genetics; Recombinases/metabolism

10.

Poly(ADP-ribose) glycohydrolase coordinates meiotic DNA double-strand break induction and repair independent of its catalytic activity.

Janisiw, Eva; Raices, Marilina; Balmir, Fabiola; Paulin, Luis F; Baudrimont, Antoine; von Haeseler, Arndt; Yanowitz, Judith L; Jantsch, Verena; Silva, Nicola.

Nat Commun ; 11(1): 4869, 2020 Sep 25.

Article in English | MEDLINE | ID: mdl-32978394

ABSTRACT

Poly(ADP-ribosyl)ation is a reversible post-translational modification synthesized by ADP-ribose transferases and removed by poly(ADP-ribose) glycohydrolase (PARG), which plays important roles in DNA damage repair. While well-studied in somatic tissues, much less is known about poly(ADP-ribosyl)ation in the germline, where DNA double-strand breaks are introduced by a regulated program and repaired by crossover recombination to establish a tether between homologous chromosomes. The interaction between the parental chromosomes is facilitated by meiotic-specific adaptation of the chromosome axes and cohesins, and reinforced by the synaptonemal complex. Here, we uncover an unexpected role for PARG in coordinating the induction of meiotic DNA breaks and their homologous recombination-mediated repair in Caenorhabditis elegans. PARG-1/PARG interacts with both axial and central elements of the synaptonemal complex, REC-8/Rec8 and the MRN/X complex. PARG-1 shapes the recombination landscape and reinforces the tightly regulated control of crossover numbers without requiring its catalytic activity. We unravel roles in regulating meiosis, beyond its enzymatic activity in poly(ADP-ribose) catabolism.

Subjects

Caenorhabditis elegans/metabolism; DNA Breaks, Double-Stranded; DNA Repair/physiology; DNA/metabolism; Glycoside Hydrolases/metabolism; Animals; Caenorhabditis elegans/genetics; Caenorhabditis elegans Proteins/genetics; Caenorhabditis elegans Proteins/metabolism; Cell Nucleus/metabolism; Germ Cells; Glycoside Hydrolases/genetics; Nuclear Proteins/genetics; Nuclear Proteins/metabolism; Poly ADP Ribosylation; Poly Adenosine Diphosphate Ribose/metabolism; Protein Processing, Post-Translational

11.

PhyloFlu, a DNA microarray for determining the phylogenetic origin of influenza A virus gene segments and the genomic fingerprint of viral strains.

Paulin, Luis F; de los D Soto-Del Río, María; Sánchez, Iván; Hernández, Jesús; Gutiérrez-Ríos, Rosa M; López-Martínez, Irma; Wong-Chew, Rosa M; Parissi-Crivelli, Aurora; Isa, P; López, Susana; Arias, Carlos F.

J Clin Microbiol ; 52(3): 803-13, 2014 Mar.

Article in English | MEDLINE | ID: mdl-24353006

ABSTRACT

Recent evidence suggests that most influenza A virus gene segments can contribute to the pathogenicity of the virus. In this regard, the hemagglutinin (HA) subtype of the circulating strains has been closely surveyed, but the reassortment of internal gene segments is usually not monitored as a potential source of an increased pathogenicity. In this work, an oligonucleotide DNA microarray (PhyloFlu) designed to determine the phylogenetic origins of the eight segments of the influenza virus genome was constructed and validated. Clades were defined for each segment and also for the 16 HA and 9 neuraminidase (NA) subtypes. Viral genetic material was amplified by reverse transcription-PCR (RT-PCR) with primers specific to the conserved 5' and 3' ends of the influenza A virus genes, followed by PCR amplification with random primers and Cy3 labeling. The microarray unambiguously determined the clades for all eight influenza virus genes in 74% (28/38) of the samples. The microarray was validated with reference strains from different animal origins, as well as from human, swine, and avian viruses from field or clinical samples. In most cases, the phylogenetic clade of each segment defined its animal host of origin. The genomic fingerprint deduced by the combined information of the individual clades allowed for the determination of the time and place that strains with the same genomic pattern were previously reported. PhyloFlu is useful for characterizing and surveying the genetic diversity and variation of animal viruses circulating in different environmental niches and for obtaining a more detailed surveillance and follow up of reassortant events that can potentially modify virus pathogenicity.

Subjects

DNA Fingerprinting/methods; Genome, Viral; Genotyping Techniques/methods; Influenza A virus/classification; Influenza, Human/virology; Microarray Analysis/methods; Paramyxoviridae Infections/veterinary; Animals; Cluster Analysis; Humans; Influenza A virus/genetics; Influenza A virus/isolation & purification; Paramyxoviridae Infections/virology; Phylogeny; RNA, Viral/genetics; Reverse Transcriptase Polymerase Chain Reaction; Virology/methods
