Search | Biblioteca Virtual em Saúde

sideRETRO: uma ferramenta de bioinformática dedicada à identificação de inserções polimórficas, germinativas ou somáticas, de pseudogenes processados / sideRETRO: a bioinformatics tool for identifying somatic and polymorphic insertions of processed pseudogenes

Miller, Thiago Luiz Araujo.

São Paulo; s.n; s.n; 2022. 186 p. tables, graphs, illustrations.

Thesis in Portuguese | LILACS | ID: biblio-1397348

ABSTRACT

The methodological and instrumental advances resulting from the Human Genome Project created the framework necessary for the emergence of next-generation DNA sequencing technologies, which are characterized by reduced cost, low operational demand and the generation of a large volume of data per experiment. At the same time, the increase in computational processing power has enabled large-scale genetic analyses, making it possible to study individual genomic features that had previously been little explored or not explored at all. Among these features, those related to structural variation in genomes have received considerable attention. Processed pseudogenes, or retrocopies, are structural variations caused by the duplication of coding genes through the transposition of their mature messenger RNA by the LINE-1 enzymatic machinery. Retrocopies can be fixed (i.e., present in all genomes of a given species and therefore included in the reference genome assembly) or unfixed, being polymorphic, germline or somatic. However, knowledge about unfixed retrocopies is still limited because of the lack of bioinformatics tools dedicated to their identification and annotation in DNA sequencing data. This work therefore presents sideRETRO, a computer program specialized in detecting processed pseudogenes that are absent from the reference genome but present in whole-genome and exome sequencing data from other individuals. In addition to reporting the presence of unfixed retrocopies, sideRETRO annotates several other characteristics of these events: the genomic coordinate of the processed pseudogene insertion, which comprises the chromosome, the insertion point and the DNA strand (leading or lagging); the genomic context of the event (exonic, intronic or intergenic); genotyping (present or absent); and haplotyping (homozygous or heterozygous).
To assess the efficiency of the tool, sideRETRO was run on simulated data and on real data experimentally validated by an independent group. In summary, this thesis describes the development and use of sideRETRO, a robust and efficient computational tool designed to identify and annotate unfixed processed pseudogenes. Finally, it is worth noting that sideRETRO fills a methodological gap and enables new hypotheses and systematic investigations in the field of structural variant calling.
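
As an illustration of the "genomic context" annotation mentioned in the abstract (exonic, intronic or intergenic), the following is a minimal sketch of how an insertion point could be classified against a gene annotation. It is not sideRETRO's implementation: the `Gene` record, the `genomic_context` function and the example coordinates are all hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Gene:
    """Hypothetical gene record; coordinates are 1-based, inclusive."""
    chrom: str
    start: int
    end: int
    exons: List[Tuple[int, int]]


def genomic_context(chrom: str, pos: int, genes: List[Gene]) -> str:
    """Classify an insertion point as exonic, intronic or intergenic."""
    inside_gene = False
    for gene in genes:
        # Skip genes on other chromosomes or not spanning the position.
        if gene.chrom != chrom or not (gene.start <= pos <= gene.end):
            continue
        inside_gene = True
        # An exon hit in any overlapping gene takes precedence.
        if any(start <= pos <= end for start, end in gene.exons):
            return "exonic"
    return "intronic" if inside_gene else "intergenic"


# Hypothetical annotation with a single two-exon gene.
genes = [Gene("chr1", 100, 1000, [(100, 200), (800, 1000)])]

for position in (150, 500, 5000):
    print(position, genomic_context("chr1", position, genes))
```

A linear scan is enough for a sketch; with a real annotation one would typically use an interval index instead.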

Subjects

Polymorphism, Genetic/genetics ; Computational Biology/classification ; Computational Biology/instrumentation ; Costs and Cost Analysis ; Genomics/instrumentation ; Sequence Analysis, DNA/instrumentation ; Clinical Coding

Transposon insertion profiling by sequencing (TIPseq) for mapping LINE-1 insertions in the human genome.

Steranka, Jared P; Tang, Zuojian; Grivainis, Mark; Huang, Cheng Ran Lisa; Payer, Lindsay M; Rego, Fernanda O R; Miller, Thiago Luiz Araujo; Galante, Pedro A F; Ramaswami, Sitharam; Heguy, Adriana; Fenyö, David; Boeke, Jef D; Burns, Kathleen H.

Mob DNA ; 10: 8, 2019.

Article in English | MEDLINE | ID: mdl-30899333

ABSTRACT

BACKGROUND: Transposable elements make up a significant portion of the human genome. Accurately locating these mobile DNAs is vital to understanding their role as a source of structural variation and somatic mutation. To this end, laboratories have developed strategies to selectively amplify or otherwise enrich transposable element insertion sites in genomic DNA. RESULTS: Here we describe a technique, Transposon Insertion Profiling by sequencing (TIPseq), to map Long INterspersed Element 1 (LINE-1, L1) retrotransposon insertions in the human genome. This method uses vectorette PCR to amplify species-specific L1 (L1PA1) insertion sites, followed by paired-end Illumina sequencing. In addition to providing a step-by-step molecular biology protocol, we offer users a guide to our pipeline for data analysis, TIPseqHunter. Our recent studies in pancreatic and ovarian cancer demonstrate the ability of TIPseq to identify invariant (fixed), polymorphic (inherited variants), as well as somatically acquired L1 insertions that distinguish cancer genomes from a patient's constitutional make-up. CONCLUSIONS: TIPseq provides an approach for amplifying evolutionarily young, active transposable element insertion sites from genomic DNA. Our rationale and variations on this protocol may be useful to those mapping L1 and other mobile elements in complex genomes.

Measuring plasma levels of three microRNAs can improve the accuracy for identification of malignant breast lesions in women with BI-RADS 4 mammography.

Pezuk, Julia Alejandra; Miller, Thiago Luiz Araujo; Bevilacqua, José Luiz Barbosa; de Barros, Alfredo Carlos Simões Dornellas; de Andrade, Felipe Eduardo Martins; E Macedo, Luiza Freire de Andrade; Aguilar, Vera; Claro, Amanda Natasha Menardo; Camargo, Anamaria Aranha; Galante, Pedro Alexandre Favoretto; Reis, Luiz F L.

Oncotarget ; 8(48): 83940-83948, 2017 Oct 13.

Article in English | MEDLINE | ID: mdl-29137394

ABSTRACT

A BI-RADS category 4 mammogram indicates suspicious breast lesions, which require core biopsy for diagnosis and have approximately a one-third chance of being malignant. Human plasma contains many circulating microRNAs, and variations in their levels have been associated with pathologies, including cancer. Here, we present a novel methodology to identify malignant breast lesions in women with BI-RADS 4 mammography. First, we used a miRNome array and qRT-PCR to define circulating microRNAs that were differentially represented in blood samples from women with breast tumors (BI-RADS 5 or 6) in comparison to controls (BI-RADS 1 or 2). Next, we used qRT-PCR to quantify the levels of these circulating microRNAs in patients whose mammograms were classified as BI-RADS category 4. Finally, we developed a machine learning method (an Artificial Neural Network, ANN) that receives circulating microRNA levels and automatically classifies BI-RADS 4 breast lesions as malignant or benign. We identified a minimum set of three circulating miRNAs (miR-15a, miR-101 and miR-144) with altered levels in patients with breast cancer. These three miRNAs were quantified in plasma from 60 patients presenting biopsy-proven BI-RADS 4 lesions. The resulting ANN correctly classified BI-RADS 4 lesions as malignant or benign with approximately 92.5% accuracy, 95% specificity and 88% sensitivity. We believe that our strategy of using circulating microRNAs and a machine learning method to classify BI-RADS 4 breast lesions is a non-invasive, non-stressful and valuable complementary approach to core biopsy in women with BI-RADS 4 lesions.
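
For readers unfamiliar with the three figures quoted in the abstract, the sketch below shows how accuracy, sensitivity and specificity are derived from a binary confusion matrix (malignant = positive, benign = negative). The function name and the counts are hypothetical and chosen only for illustration; they are not the study's data.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    positives = tp + fn  # truly malignant lesions
    negatives = tn + fp  # truly benign lesions
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "sensitivity": tp / positives,  # true positive rate
        "specificity": tn / negatives,  # true negative rate
    }


# Hypothetical counts, NOT taken from the paper.
metrics = classification_metrics(tp=8, fp=1, tn=9, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity reflects how many malignant lesions are caught, while specificity reflects how many benign lesions are correctly left unflagged.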
