Search | VHL Infectious and Parasitic Diseases

1.

Personal omics profiling reveals dynamic molecular and medical phenotypes.

Chen, Rui; Mias, George I; Li-Pook-Than, Jennifer; Jiang, Lihua; Lam, Hugo Y K; Chen, Rong; Miriami, Elana; Karczewski, Konrad J; Hariharan, Manoj; Dewey, Frederick E; Cheng, Yong; Clark, Michael J; Im, Hogune; Habegger, Lukas; Balasubramanian, Suganthi; O'Huallachain, Maeve; Dudley, Joel T; Hillenmeyer, Sara; Haraksingh, Rajini; Sharon, Donald; Euskirchen, Ghia; Lacroute, Phil; Bettinger, Keith; Boyle, Alan P; Kasowski, Maya; Grubert, Fabian; Seki, Scott; Garcia, Marco; Whirl-Carrillo, Michelle; Gallardo, Mercedes; Blasco, Maria A; Greenberg, Peter L; Snyder, Phyllis; Klein, Teri E; Altman, Russ B; Butte, Atul J; Ashley, Euan A; Gerstein, Mark; Nadeau, Kari C; Tang, Hua; Snyder, Michael.

Cell ; 148(6): 1293-307, 2012 Mar 16.

Article in English | MEDLINE | ID: mdl-22424236

ABSTRACT

Personalized medicine is expected to benefit from combining genomic information with regular monitoring of physiological states by multiple high-throughput methods. Here, we present an integrative personal omics profile (iPOP), an analysis that combines genomic, transcriptomic, proteomic, metabolomic, and autoantibody profiles from a single individual over a 14 month period. Our iPOP analysis revealed various medical risks, including type 2 diabetes. It also uncovered extensive, dynamic changes in diverse molecular components and biological pathways across healthy and diseased conditions. Extremely high-coverage genomic and transcriptomic data, which provide the basis of our iPOP, revealed extensive heteroallelic changes during healthy and diseased states and an unexpected RNA editing mechanism. This study demonstrates that longitudinal iPOP can be used to interpret healthy and diseased states by connecting genomic information with additional dynamic omics activity.

Subjects

Human Genome , Genomics , Precision Medicine , Type 2 Diabetes Mellitus/genetics , Female , Gene Expression Profiling , Humans , Male , Metabolomics , Middle Aged , Mutation , Proteomics , Respiratory Syncytial Viruses/isolation & purification , Rhinovirus/isolation & purification

2.

Insights from incorporating quantum computing into drug design workflows.

Lau, Bayo; Emani, Prashant S; Chapman, Jackson; Yao, Lijing; Lam, Tarsus; Merrill, Paul; Warrell, Jonathan; Gerstein, Mark B; Lam, Hugo Y K.

Bioinformatics ; 39(1), 2023 01 01.

Article in English | MEDLINE | ID: mdl-36477833

ABSTRACT

MOTIVATION: While many quantum computing (QC) methods promise theoretical advantages over classical counterparts, quantum hardware remains limited. Exploiting near-term QC in computer-aided drug design (CADD) thus requires judicious partitioning between classical and quantum calculations. RESULTS: We present HypaCADD, a hybrid classical-quantum workflow for finding ligands binding to proteins, while accounting for genetic mutations. We explicitly identify modules of our drug-design workflow currently amenable to replacement by QC: non-intuitively, we identify the mutation-impact predictor as the best candidate. HypaCADD thus combines classical docking and molecular dynamics with quantum machine learning (QML) to infer the impact of mutations. We present a case study with the coronavirus (SARS-CoV-2) protease and associated mutants. We map a classical machine-learning module onto QC, using a neural network constructed from qubit-rotation gates. We have implemented this in simulation and on two commercial quantum computers. We find that the QML models can perform on par with, if not better than, classical baselines. In summary, HypaCADD offers a successful strategy for leveraging QC for CADD. AVAILABILITY AND IMPLEMENTATION: Jupyter Notebooks with Python code are freely available for academic use on GitHub: https://www.github.com/hypahub/hypacadd_notebook. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

COVID-19 , Software , Humans , Workflow , Computing Methodologies , Quantum Theory , SARS-CoV-2 , Drug Design , Molecular Dynamics Simulation

3.

Whole-genome sequencing of Atacama skeleton shows novel mutations linked with dysplasia.

Bhattacharya, Sanchita; Li, Jian; Sockell, Alexandra; Kan, Matthew J; Bava, Felice A; Chen, Shann-Ching; Ávila-Arcos, María C; Ji, Xuhuai; Smith, Emery; Asadi, Narges B; Lachman, Ralph S; Lam, Hugo Y K; Bustamante, Carlos D; Butte, Atul J; Nolan, Garry P.

Genome Res ; 28(4): 423-431, 2018 04.

Article in English | MEDLINE | ID: mdl-29567674

ABSTRACT

Over a decade ago, the Atacama humanoid skeleton (Ata) was discovered in the Atacama region of Chile. The Ata specimen carried a strange phenotype (6-in stature, fewer than expected ribs, elongated cranium, and accelerated bone age), leading to speculation that this was a preserved nonhuman primate, human fetus harboring genetic mutations, or even an extraterrestrial. We previously reported that it was human by DNA analysis with an estimated bone age of about 6-8 yr at the time of demise. To determine the possible genetic drivers of the observed morphology, DNA from the specimen was subjected to whole-genome sequencing using the Illumina HiSeq platform with an average 11.5× coverage of 101-bp, paired-end reads. Compared with the human reference genome, 3,356,569 single nucleotide variations (SNVs), 518,365 insertions and deletions (indels), and 1047 structural variations (SVs) were detected. Here, we present the detailed whole-genome analysis showing that Ata is a female of human origin, likely of Chilean descent, and its genome harbors mutations in genes (COL1A1, COL2A1, KMT2D, FLNB, ATR, TRIP11, PCNT) previously linked with diseases of small stature, rib anomalies, cranial malformations, premature joint fusion, and osteochondrodysplasia (also known as skeletal dysplasia). Together, these findings provide a molecular characterization of Ata's peculiar phenotype, which likely results from multiple known and novel putative gene mutations affecting bone development and ossification.

Subjects

Ancient DNA/analysis , Human Genome/genetics , Osteochondrodysplasias/genetics , Whole Genome Sequencing , Animals , Female , High-Throughput Nucleotide Sequencing , Humans , INDEL Mutation , Molecular Sequence Annotation , Mutation/genetics , Osteochondrodysplasias/physiopathology , Phenotype , Single Nucleotide Polymorphism/genetics

4.

An integrated map of structural variation in 2,504 human genomes.

Sudmant, Peter H; Rausch, Tobias; Gardner, Eugene J; Handsaker, Robert E; Abyzov, Alexej; Huddleston, John; Zhang, Yan; Ye, Kai; Jun, Goo; Fritz, Markus Hsi-Yang; Konkel, Miriam K; Malhotra, Ankit; Stütz, Adrian M; Shi, Xinghua; Casale, Francesco Paolo; Chen, Jieming; Hormozdiari, Fereydoun; Dayama, Gargi; Chen, Ken; Malig, Maika; Chaisson, Mark J P; Walter, Klaudia; Meiers, Sascha; Kashin, Seva; Garrison, Erik; Auton, Adam; Lam, Hugo Y K; Mu, Xinmeng Jasmine; Alkan, Can; Antaki, Danny; Bae, Taejeong; Cerveira, Eliza; Chines, Peter; Chong, Zechen; Clarke, Laura; Dal, Elif; Ding, Li; Emery, Sarah; Fan, Xian; Gujral, Madhusudan; Kahveci, Fatma; Kidd, Jeffrey M; Kong, Yu; Lameijer, Eric-Wubbo; McCarthy, Shane; Flicek, Paul; Gibbs, Richard A; Marth, Gabor; Mason, Christopher E; Menelaou, Androniki.

Nature ; 526(7571): 75-81, 2015 Oct 01.

Article in English | MEDLINE | ID: mdl-26432246

ABSTRACT

Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association.

Subjects

Genetic Variation/genetics , Human Genome/genetics , Physical Chromosome Mapping , Amino Acid Sequence , Genetic Predisposition to Disease , Medical Genetics , Population Genetics , Genome-Wide Association Study , Genomics , Genotype , Haplotypes/genetics , Homozygote , Humans , Molecular Sequence Data , Mutation Rate , Single Nucleotide Polymorphism/genetics , Quantitative Trait Loci/genetics , DNA Sequence Analysis , Sequence Deletion/genetics

5.

Lessons from the CAGI-4 Hopkins clinical panel challenge.

Chandonia, John-Marc; Adhikari, Aashish; Carraro, Marco; Chhibber, Aparna; Cutting, Garry R; Fu, Yao; Gasparini, Alessandra; Jones, David T; Kramer, Andreas; Kundu, Kunal; Lam, Hugo Y K; Leonardi, Emanuela; Moult, John; Pal, Lipika R; Searls, David B; Shah, Sohela; Sunyaev, Shamil; Tosatto, Silvio C E; Yin, Yizhou; Buckley, Bethany A.

Hum Mutat ; 38(9): 1155-1168, 2017 09.

Article in English | MEDLINE | ID: mdl-28397312

ABSTRACT

The CAGI-4 Hopkins clinical panel challenge was an attempt to assess state-of-the-art methods for clinical phenotype prediction from DNA sequence. Participants were provided with exonic sequences of 83 genes for 106 patients from the Johns Hopkins DNA Diagnostic Laboratory. Five groups participated in the challenge, predicting both the probability that each patient had each of the 14 possible classes of disease, as well as one or more causal variants. In cases where the Hopkins laboratory reported a variant, at least one predictor correctly identified the disease class in 36 of the 43 patients (84%). Even in cases where the Hopkins laboratory did not find a variant, at least one predictor correctly identified the class in 39 of the 63 patients (62%). Each prediction group correctly diagnosed at least one patient that was not successfully diagnosed by any other group. We discuss the causal variant predictions by different groups and their implications for further development of methods to assess variants of unknown significance. Our results suggest that clinically relevant variants may be missed when physicians order small panels targeted on a specific phenotype. We also quantify the false-positive rate of DNA-guided analysis in the absence of prior phenotypic indication.
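
The percentages quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the counts stated in the abstract; the function and variable names are illustrative and do not come from the paper.

```python
# Counts are taken from the abstract; all names here are illustrative.

def percent(hits: int, total: int) -> int:
    """Return hits/total as a percentage rounded to the nearest integer."""
    return round(100 * hits / total)

# (patients with a correct disease class from at least one predictor, patients in group)
with_variant = (36, 43)      # cases where the laboratory reported a variant
without_variant = (39, 63)   # cases where the laboratory did not find a variant

pct_with = percent(*with_variant)
pct_without = percent(*without_variant)

# The two groups together should account for every patient in the challenge.
total_patients = with_variant[1] + without_variant[1]

print(pct_with, pct_without, total_patients)
```

The two groups sum to the 106 patients mentioned in the abstract, and the rounded ratios agree with the reported 84% and 62%.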

Subjects

Computational Biology/methods , DNA Sequence Analysis/methods , Genetic Databases , Genetic Predisposition to Disease , Genetic Testing , Humans , Phenotype

6.

LongISLND: in silico sequencing of lengthy and noisy datatypes.

Lau, Bayo; Mohiyuddin, Marghoob; Mu, John C; Fang, Li Tai; Bani Asadi, Narges; Dallett, Carolina; Lam, Hugo Y K.

Bioinformatics ; 32(24): 3829-3832, 2016 12 15.

Article in English | MEDLINE | ID: mdl-27667791

ABSTRACT

LongISLND is a software package designed to simulate sequencing data according to the characteristics of third generation, single-molecule sequencing technologies. The general software architecture is easily extendable, as demonstrated by the emulation of Pacific Biosciences (PacBio) multi-pass sequencing with P5 and P6 chemistries, producing data in FASTQ, H5, and the latest PacBio BAM format. We demonstrate its utility by downstream processing with consensus building and variant calling. AVAILABILITY AND IMPLEMENTATION: LongISLND is implemented in Java and available at http://bioinform.github.io/longislnd CONTACT: hugo.lam@roche.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods , Software , Computer Simulation , Sequence Alignment

7.

Mapping copy number variation by population-scale genome sequencing.

Mills, Ryan E; Walter, Klaudia; Stewart, Chip; Handsaker, Robert E; Chen, Ken; Alkan, Can; Abyzov, Alexej; Yoon, Seungtai Chris; Ye, Kai; Cheetham, R Keira; Chinwalla, Asif; Conrad, Donald F; Fu, Yutao; Grubert, Fabian; Hajirasouliha, Iman; Hormozdiari, Fereydoun; Iakoucheva, Lilia M; Iqbal, Zamin; Kang, Shuli; Kidd, Jeffrey M; Konkel, Miriam K; Korn, Joshua; Khurana, Ekta; Kural, Deniz; Lam, Hugo Y K; Leng, Jing; Li, Ruiqiang; Li, Yingrui; Lin, Chang-Yun; Luo, Ruibang; Mu, Xinmeng Jasmine; Nemesh, James; Peckham, Heather E; Rausch, Tobias; Scally, Aylwyn; Shi, Xinghua; Stromberg, Michael P; Stütz, Adrian M; Urban, Alexander Eckehart; Walker, Jerilyn A; Wu, Jiantao; Zhang, Yujun; Zhang, Zhengdong D; Batzer, Mark A; Ding, Li; Marth, Gabor T; McVean, Gil; Sebat, Jonathan; Snyder, Michael; Wang, Jun.

Nature ; 470(7332): 59-65, 2011 Feb 03.

Article in English | MEDLINE | ID: mdl-21293372

ABSTRACT

Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications. Most SVs (53%) were mapped to nucleotide resolution, which facilitated analysing their origin and functional impact. We examined numerous whole and partial gene deletions with a genotyping approach and observed a depletion of gene disruptions amongst high frequency deletions. Furthermore, we observed differences in the size spectra of SVs originating from distinct formation mechanisms, and constructed a map of SV hotspots formed by common mechanisms. Our analytical framework and SV map serve as a resource for sequencing-based association studies.

Subjects

DNA Copy Number Variations/genetics , Population Genetics , Human Genome/genetics , Genomics , Gene Duplication/genetics , Genetic Predisposition to Disease/genetics , Genotype , Humans , Insertional Mutagenesis/genetics , Reproducibility of Results , DNA Sequence Analysis , Sequence Deletion/genetics

8.

svclassify: a method to establish benchmark structural variant calls.

Parikh, Hemang; Mohiyuddin, Marghoob; Lam, Hugo Y K; Iyer, Hariharan; Chen, Desu; Pratt, Mark; Bartha, Gabor; Spies, Noah; Losert, Wolfgang; Zook, Justin M; Salit, Marc.

BMC Genomics ; 17: 64, 2016 Jan 16.

Article in English | MEDLINE | ID: mdl-26772178

ABSTRACT

BACKGROUND: The human genome contains variants ranging in size from small single nucleotide polymorphisms (SNPs) to large structural variants (SVs). High-quality benchmark small variant calls for the pilot National Institute of Standards and Technology (NIST) Reference Material (NA12878) have been developed by the Genome in a Bottle Consortium, but no similar high-quality benchmark SV calls exist for this genome. Since SV callers output highly discordant results, we developed methods to combine multiple forms of evidence from multiple sequencing technologies to classify candidate SVs into likely true or false positives. Our method (svclassify) calculates annotations from one or more aligned bam files from many high-throughput sequencing technologies, and then builds a one-class model using these annotations to classify candidate SVs as likely true or false positives. RESULTS: We first used pedigree analysis to develop a set of high-confidence breakpoint-resolved large deletions. We then used svclassify to cluster and classify these deletions as well as a set of high-confidence deletions from the 1000 Genomes Project and a set of breakpoint-resolved complex insertions from Spiral Genetics. We find that likely SVs cluster separately from likely non-SVs based on our annotations, and that the SVs cluster into different types of deletions. We then developed a supervised one-class classification method that uses a training set of random non-SV regions to determine whether candidate SVs have abnormal annotations different from most of the genome. To test this classification method, we use our pedigree-based breakpoint-resolved SVs, SVs validated by the 1000 Genomes Project, and assembly-based breakpoint-resolved insertions, along with semi-automated visualization using svviz. 
CONCLUSIONS: We find that candidate SVs with high scores from multiple technologies have high concordance with PCR validation and an orthogonal consensus method MetaSV (99.7% concordant), and candidate SVs with low scores are questionable. We distribute a set of 2676 high-confidence deletions and 68 high-confidence insertions with high svclassify scores from these call sets for benchmarking SV callers. We expect these methods to be particularly useful for establishing high-confidence SV calls for benchmark samples that have been characterized by multiple technologies.

Subjects

Human Genome , Genomic Structural Variation , Software , Benchmarking , Genomics , High-Throughput Nucleotide Sequencing , Humans , Molecular Sequence Annotation , Pedigree , Single Nucleotide Polymorphism/genetics

9.

MetaSV: an accurate and integrative structural-variant caller for next generation sequencing.

Mohiyuddin, Marghoob; Mu, John C; Li, Jian; Bani Asadi, Narges; Gerstein, Mark B; Abyzov, Alexej; Wong, Wing H; Lam, Hugo Y K.

Bioinformatics ; 31(16): 2741-4, 2015 Aug 15.

Article in English | MEDLINE | ID: mdl-25861968

ABSTRACT

UNLABELLED: Structural variations (SVs) are large genomic rearrangements that vary significantly in size, making them challenging to detect with the relatively short reads from next-generation sequencing (NGS). Different SV detection methods have been developed; however, each is limited to specific kinds of SVs with varying accuracy and resolution. Previous works have attempted to combine different methods, but they still suffer from poor accuracy particularly for insertions. We propose MetaSV, an integrated SV caller which leverages multiple orthogonal SV signals for high accuracy and resolution. MetaSV proceeds by merging SVs from multiple tools for all types of SVs. It also analyzes soft-clipped reads from alignment to detect insertions accurately since existing tools underestimate insertion SVs. Local assembly in combination with dynamic programming is used to improve breakpoint resolution. Paired-end and coverage information is used to predict SV genotypes. Using simulation and experimental data, we demonstrate the effectiveness of MetaSV across various SV types and sizes. AVAILABILITY AND IMPLEMENTATION: Code in Python is at http://bioinform.github.io/metasv/. CONTACT: rd@bina.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
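
The idea of merging SV calls from several tools can be illustrated with a small sketch. This is not MetaSV's actual algorithm or API: the `SV` record, the reciprocal-overlap criterion, the 0.5 threshold, and the tool names are all assumptions chosen for illustration.

```python
from typing import NamedTuple


class SV(NamedTuple):
    """Hypothetical minimal SV record (not MetaSV's data model)."""
    chrom: str
    start: int    # 0-based, inclusive
    end: int      # exclusive
    sv_type: str  # e.g. "DEL", "DUP"
    tool: str     # name of the caller that reported it


def reciprocal_overlap(a: SV, b: SV) -> float:
    """Smaller of the two overlap fractions; 0.0 if types or chromosomes differ."""
    if a.chrom != b.chrom or a.sv_type != b.sv_type:
        return 0.0
    inter = min(a.end, b.end) - max(a.start, b.start)
    if inter <= 0:
        return 0.0
    return min(inter / (a.end - a.start), inter / (b.end - b.start))


def merge_calls(calls, min_ro=0.5):
    """Greedy merge: a call joins the first cluster whose first member it overlaps enough."""
    clusters = []
    for call in sorted(calls, key=lambda c: (c.chrom, c.sv_type, c.start, c.end)):
        for cluster in clusters:
            if reciprocal_overlap(cluster[0], call) >= min_ro:
                cluster.append(call)
                break
        else:
            clusters.append([call])
    # Summarise each cluster by its outer bounds and the tools that support it.
    return [
        {
            "chrom": cluster[0].chrom,
            "sv_type": cluster[0].sv_type,
            "start": min(c.start for c in cluster),
            "end": max(c.end for c in cluster),
            "tools": sorted({c.tool for c in cluster}),
        }
        for cluster in clusters
    ]


calls = [
    SV("chr1", 1000, 2000, "DEL", "toolA"),
    SV("chr1", 1050, 2100, "DEL", "toolB"),
    SV("chr1", 5000, 5200, "DEL", "toolA"),
    SV("chr1", 1000, 2000, "DUP", "toolC"),
]
merged = merge_calls(calls)
```

In this toy input the two overlapping deletions collapse into one record supported by both tools, while the distant deletion and the duplication stay separate. A real integrative caller would also refine breakpoints and genotypes, which this sketch does not attempt.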

Subjects

Genetic Variation , High-Throughput Nucleotide Sequencing/methods , Software , Insertional Mutagenesis , Sequence Deletion

10.

VarSim: a high-fidelity simulation and validation framework for high-throughput genome sequencing with cancer applications.

Mu, John C; Mohiyuddin, Marghoob; Li, Jian; Bani Asadi, Narges; Gerstein, Mark B; Abyzov, Alexej; Wong, Wing H; Lam, Hugo Y K.

Bioinformatics ; 31(9): 1469-71, 2015 May 01.

Article in English | MEDLINE | ID: mdl-25524895

ABSTRACT

SUMMARY: VarSim is a framework for assessing alignment and variant calling accuracy in high-throughput genome sequencing through simulation or real data. In contrast to simulating a random mutation spectrum, it synthesizes diploid genomes with germline and somatic mutations based on a realistic model. This model leverages information such as previously reported mutations to make the synthetic genomes biologically relevant. VarSim simulates and validates a wide range of variants, including single nucleotide variants, small indels and large structural variants. It is an automated, comprehensive compute framework supporting parallel computation and multiple read simulators. Furthermore, we developed a novel map data structure to validate read alignments, a strategy to compare variants binned in size ranges and a lightweight, interactive, graphical report to visualize validation results with detailed statistics. Thus far, it is the most comprehensive validation tool for secondary analysis in next generation sequencing. AVAILABILITY AND IMPLEMENTATION: Code in Java and Python along with instructions to download the reads and variants is at http://bioinform.github.io/varsim. CONTACT: rd@bina.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
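
Comparing variants binned in size ranges can be sketched as follows. This is only an illustration under stated assumptions, not VarSim's implementation: the bin edges, the `(chrom, pos, length)` tuple layout, and exact-match comparison are hypothetical choices.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical lower edges of the size bins, in base pairs.
BIN_EDGES = [1, 50, 1000]


def size_bin(length: int):
    """Return the (low, high) bin containing length; high is None for the open last bin."""
    if length < BIN_EDGES[0]:
        raise ValueError("variant length must be at least %d" % BIN_EDGES[0])
    i = bisect_right(BIN_EDGES, length) - 1
    high = BIN_EDGES[i + 1] if i + 1 < len(BIN_EDGES) else None
    return (BIN_EDGES[i], high)


def recall_by_bin(truth, called):
    """Fraction of true variants recovered in each size bin (exact matches only)."""
    called_set = set(called)
    total, found = Counter(), Counter()
    for variant in truth:
        b = size_bin(variant[2])
        total[b] += 1
        if variant in called_set:
            found[b] += 1
    return {b: found[b] / total[b] for b in total}


# Variants as (chrom, pos, length) tuples; the values are made up for the example.
truth = [("chr1", 100, 1), ("chr1", 200, 3), ("chr1", 300, 120), ("chr2", 50, 5000)]
called = [("chr1", 100, 1), ("chr1", 300, 120)]
recall = recall_by_bin(truth, called)
```

Reporting accuracy per bin keeps the numerous small variants from masking poor performance on the much rarer large structural variants.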

Subjects

Genetic Variation , High-Throughput Nucleotide Sequencing/methods , Software , Computer Simulation , Genomics , Humans , Mutation , Neoplasms/genetics , Sequence Alignment

11.

A comprehensive map of mobile element insertion polymorphisms in humans.

Stewart, Chip; Kural, Deniz; Strömberg, Michael P; Walker, Jerilyn A; Konkel, Miriam K; Stütz, Adrian M; Urban, Alexander E; Grubert, Fabian; Lam, Hugo Y K; Lee, Wan-Ping; Busby, Michele; Indap, Amit R; Garrison, Erik; Huff, Chad; Xing, Jinchuan; Snyder, Michael P; Jorde, Lynn B; Batzer, Mark A; Korbel, Jan O; Marth, Gabor T.

PLoS Genet ; 7(8): e1002236, 2011 Aug.

Article in English | MEDLINE | ID: mdl-21876680

ABSTRACT

As a consequence of the accumulation of insertion events over evolutionary time, mobile elements now comprise nearly half of the human genome. The Alu, L1, and SVA mobile element families are still duplicating, generating variation between individual genomes. Mobile element insertions (MEI) have been identified as causes for genetic diseases, including hemophilia, neurofibromatosis, and various cancers. Here we present a comprehensive map of 7,380 MEI polymorphisms from the 1000 Genomes Project whole-genome sequencing data of 185 samples in three major populations detected with two detection methods. This catalog enables us to systematically study mutation rates, population segregation, genomic distribution, and functional properties of MEI polymorphisms and to compare MEI to SNP variation from the same individuals. Population allele frequencies of MEI and SNPs are described, broadly, by the same neutral ancestral processes despite vastly different mutation mechanisms and rates, except in coding regions where MEI are virtually absent, presumably due to strong negative selection. A direct comparison of MEI and SNP diversity levels suggests a differential mobile element insertion rate among populations.

Subjects

DNA Transposable Elements , Human Genome , Single Nucleotide Polymorphism , Gene Frequency , Genotype , Heterozygote , Humans , Insertional Mutagenesis , Mutation Rate

12.

Analysis of genomic variation in non-coding elements using population-scale sequencing data from the 1000 Genomes Project.

Mu, Xinmeng Jasmine; Lu, Zhi John; Kong, Yong; Lam, Hugo Y K; Gerstein, Mark B.

Nucleic Acids Res ; 39(16): 7058-76, 2011 Sep 01.

Article in English | MEDLINE | ID: mdl-21596777

ABSTRACT

In the human genome, it has been estimated that considerably more sequence is under natural selection in non-coding regions [such as transcription-factor binding sites (TF-binding sites) and non-coding RNAs (ncRNAs)] compared to protein-coding ones. However, less attention has been paid to them. To study selective pressure on non-coding elements, we use next-generation sequencing data from the recently completed pilot phase of the 1000 Genomes Project, which, compared to traditional methods, allows for the characterization of a full spectrum of genomic variations, including single-nucleotide polymorphisms (SNPs), short insertions and deletions (indels) and structural variations (SVs). We develop a framework for combining these variation data with non-coding elements, calculating various population-based metrics to compare classes and subclasses of elements, and developing element-aware aggregation procedures to probe the internal structure of an element. Overall, we find that TF-binding sites and ncRNAs are less selectively constrained for SNPs than coding sequences (CDSs), but more constrained than a neutral reference. We also determine that the relative amounts of constraint for the three types of variations are, in general, correlated, but there are some differences: counter-intuitively, TF-binding sites and ncRNAs are more selectively constrained for indels than for SNPs, compared to CDSs. After inspecting the overall properties of a class of elements, we analyze selective pressure on subclasses within an element class, and show that the extent of selection is associated with the genomic properties of each subclass. We find, for instance, that ncRNAs with higher expression levels tend to be under stronger purifying selection, and the actual regions of TF-binding motifs are under stronger selective pressure than the corresponding peak regions. 
Further, we develop element-aware aggregation plots to analyze selective pressure across the linear structure of an element, with the confidence intervals evaluated using both simple bootstrapping and block bootstrapping techniques. We find, for example, that both micro-RNAs (particularly the seed regions) and their binding targets are under stronger selective pressure for SNPs than their immediate genomic surroundings. In addition, we demonstrate that substitutions in TF-binding motifs inversely correlate with site conservation, and SNPs unfavorable for motifs are under more selective constraints than favorable SNPs. Finally, to further investigate intra-element differences, we show that SVs have the tendency to use distinctive modes and mechanisms when they interact with genomic elements, such as enveloping whole gene(s) rather than disrupting them partially, as well as duplicating TF motifs in tandem.

Subjects

Intergenic DNA/chemistry , Genetic Variation , Human Genome , Binding Sites , Gene Frequency , Genomics , Humans , INDEL Mutation , MicroRNAs/chemistry , Single Nucleotide Polymorphism , Proteins/genetics , Pseudogenes , Untranslated RNA/metabolism , Transcription Factors/metabolism

13.

Genome-wide identification of binding sites defines distinct functions for Caenorhabditis elegans PHA-4/FOXA in development and environmental response.

Zhong, Mei; Niu, Wei; Lu, Zhi John; Sarov, Mihail; Murray, John I; Janette, Judith; Raha, Debasish; Sheaffer, Karyn L; Lam, Hugo Y K; Preston, Elicia; Slightham, Cindie; Hillier, LaDeana W; Brock, Trisha; Agarwal, Ashish; Auerbach, Raymond; Hyman, Anthony A; Gerstein, Mark; Mango, Susan E; Kim, Stuart K; Waterston, Robert H; Reinke, Valerie; Snyder, Michael.

PLoS Genet ; 6(2): e1000848, 2010 Feb 19.

Article in English | MEDLINE | ID: mdl-20174564

ABSTRACT

Transcription factors are key components of regulatory networks that control development, as well as the response to environmental stimuli. We have established an experimental pipeline in Caenorhabditis elegans that permits global identification of the binding sites for transcription factors using chromatin immunoprecipitation and deep sequencing. We describe and validate this strategy, and apply it to the transcription factor PHA-4, which plays critical roles in organ development and other cellular processes. We identified thousands of binding sites for PHA-4 during formation of the embryonic pharynx, and also found a role for this factor during the starvation response. Many binding sites were found to shift dramatically between embryos and starved larvae, from developmentally regulated genes to genes involved in metabolism. These results indicate distinct roles for this regulator in two different biological processes and demonstrate the versatility of transcription factors in mediating diverse biological roles.

Subjects

Caenorhabditis elegans Proteins/metabolism , Caenorhabditis elegans/growth & development , Caenorhabditis elegans/genetics , Environment , Helminth Genome/genetics , Trans-Activators/metabolism , Animals , Binding Sites , Caenorhabditis elegans Proteins/genetics , Chromatin Immunoprecipitation , Nonmammalian Embryo/metabolism , Developmental Gene Expression Regulation , Helminth Genes/genetics , Green Fluorescent Proteins/metabolism , Larva/metabolism , Protein Binding , RNA Polymerase II/metabolism , Recombinant Fusion Proteins/metabolism , Starvation , Survival Analysis , Trans-Activators/genetics , Transcription Factors/metabolism

14.

Mitral regurgitation severity at left ventricular assist device implantation is associated with distinct myocardial transcriptomic signatures.

Duggal, Neal M; Lei, Ienglam; Wu, Xiaoting; Aaronson, Keith D; Pagani, Francis D; Lam, Hugo Y-K; Tang, Paul C.

J Thorac Cardiovasc Surg ; 166(1): 141-152.e1, 2023 07.

Article in English | MEDLINE | ID: mdl-34689984

ABSTRACT

OBJECTIVES: We examined differences in pre-left ventricular assist device (LVAD) implantation myocardial transcriptome signatures among patients with different degrees of mitral regurgitation (MR). METHODS: Between January 2018 and October 2019, we collected left ventricular (LV) cores during durable LVAD implantation (n = 72). A retrospective chart review was performed. Total RNA was isolated from LV cores and used to construct cDNA sequence libraries. The libraries were sequenced with the NovaSeq system, and data were quantified using Kallisto. Gene Set Enrichment Analysis (GSEA) and Gene Ontology analyses were performed, with a false discovery rate <0.05 considered significant. RESULTS: Comparing patients with preoperative mild or less MR (n = 30) and those with moderate-severe MR (n = 42), the moderate-severe MR group weighed less (P = .004) and had more tricuspid valve repairs (P = .043), without differences in demographics or comorbidities. We then compared both groups with a group of human donor hearts without heart failure (n = 8). Compared with the donor hearts, there were 3985 differentially expressed genes (DEGs) for mild or less MR and 4587 DEGs for moderate-severe MR. Specifically altered genes included 448 DEGs specific for mild or less MR and 1050 DEGs for moderate-severe MR. On GSEA, common regulated genes showed increased immune gene expression and reduced expression of contraction and energetic genes. Of the 1050 genes specific for moderate-severe MR, there were additional up-regulated genes related to inflammation and reduced expression of genes related to cellular proliferation. CONCLUSIONS: Patients undergoing durable LVAD implantation with moderate-severe MR had increased activation of genes related to inflammation and reduction of cellular proliferation genes. This may have important implications for myocardial recovery.

Subjects

Heart Failure , Heart Transplantation , Heart-Assist Devices , Mitral Valve Insufficiency , Humans , Mitral Valve Insufficiency/diagnostic imaging , Mitral Valve Insufficiency/genetics , Mitral Valve Insufficiency/surgery , Transcriptome , Retrospective Studies , Treatment Outcome , Tissue Donors , Heart Failure/genetics , Heart Failure/surgery , Inflammation

15.

Measuring the evolutionary rewiring of biological networks.

Shou, Chong; Bhardwaj, Nitin; Lam, Hugo Y K; Yan, Koon-Kiu; Kim, Philip M; Snyder, Michael; Gerstein, Mark B.

PLoS Comput Biol ; 7(1): e1001050, 2011 Jan 06.

Article in English | MEDLINE | ID: mdl-21253555

ABSTRACT

We have accumulated a large amount of biological network data and expect even more to come. Soon, we anticipate being able to compare many different biological networks as we commonly do for molecular sequences. It has long been believed that many of these networks change, or "rewire", at different rates. It is therefore important to develop a framework to quantify the differences between networks in a unified fashion. We developed such a formalism based on analogy to simple models of sequence evolution, and used it to conduct a systematic study of network rewiring on all the currently available biological networks. We found that, similar to sequences, biological networks show a decreased rate of change at large time divergences, because of saturation in potential substitutions. However, different types of biological networks consistently rewire at different rates. Using comparative genomics and proteomics data, we found a consistent ordering of the rewiring rates: transcription regulatory, phosphorylation regulatory, genetic interaction, miRNA regulatory, protein interaction, and metabolic pathway network, from fast to slow. This ordering was found in all comparisons we did of matched networks between organisms. To gain further intuition on network rewiring, we compared our observed rewirings with those obtained from simulation. We also investigated how readily our formalism could be mapped to other network contexts; in particular, we showed how it could be applied to analyze changes in a range of "commonplace" networks such as family trees, co-authorships and linux-kernel function dependencies.

Subjects

Biological Evolution , Genomics , Proteomics

16.

Segmental duplications in the human genome reveal details of pseudogene formation.

Khurana, Ekta; Lam, Hugo Y K; Cheng, Chao; Carriero, Nicholas; Cayting, Philip; Gerstein, Mark B.

Nucleic Acids Res ; 38(20): 6997-7007, 2010 Nov.

Article in English | MEDLINE | ID: mdl-20615899

ABSTRACT

Duplicated pseudogenes in the human genome are disabled copies of functioning parent genes. They result from block duplication events occurring throughout evolutionary history. Relatively recent duplications (with sequence similarity≥90% and length≥1 kb) are termed segmental duplications (SDs); here, we analyze the interrelationship of SDs and pseudogenes. We present a decision-tree approach to classify pseudogenes based on their (and their parents') characteristics in relation to SDs. The classification identifies 140 novel pseudogenes and makes possible improved annotation for the 3172 pseudogenes located in SDs. In particular, it reveals that many pseudogenes in SDs likely did not arise directly from parent genes, but are the result of a multi-step process. In these cases, the initial duplication or retrotransposition of a parent gene gives rise to a 'parent pseudogene', followed by further duplication creating duplicated-duplicated or duplicated-processed pseudogenes, respectively. Moreover, we can precisely identify these parent pseudogenes by overlap with ancestral SD loci. Finally, a comparison of nucleotide substitutions per site in a pseudogene with its surrounding SD region allows us to estimate the time difference between duplication and disablement events, and this suggests that most duplicated pseudogenes in SDs were likely disabled around the time of the original duplication.

Subjects

Human Genome , Pseudogenes , Genomic Segmental Duplications , Molecular Evolution , Gene Duplication , Genetic Loci , Humans

17.

Gene Expression Scoring of Immune Activity Levels for Precision Use of Hydrocortisone in Vasodilatory Shock.

Yao, Lijing; Rey, Diego Ariel; Bulgarelli, Lucas; Kast, Rachel; Osborn, Jeff; Van Ark, Emily; Fang, Li Tai; Lau, Bayo; Lam, Hugo; Teixeira, Leonardo Maestri; Neto, Ary Serpa; Bellomo, Rinaldo; Deliberato, Rodrigo Octávio.

Shock ; 57(3): 384-391, 2022 03 01.

Article in English | MEDLINE | ID: mdl-35081076

ABSTRACT

PURPOSE: Among patients with vasodilatory shock, gene expression scores may identify different immune states. We aimed to test whether such scores are robust in identifying patients' immune state and predicting response to hydrocortisone treatment in vasodilatory shock. MATERIALS AND METHODS: We selected genes to generate continuous scores to define previously established subclasses of sepsis. We used these scores to identify a patient's immune state. We evaluated the potential for these states to assess the differential effect of hydrocortisone in two randomized clinical trials of hydrocortisone versus placebo in vasodilatory shock. RESULTS: We initially identified genes associated with immune-adaptive, immune-innate, immune-coagulant functions. From these genes, 15 were most relevant to generate expression scores related to each of the functions. These scores were used to identify patients as immune-adaptive prevalent (IA-P) and immune-innate prevalent (IN-P). In IA-P patients, hydrocortisone therapy increased 28-day mortality in both trials (43.3% vs 14.7%, P = 0.028) and (57.1% vs 0.0%, P = 0.99). In IN-P patients, this effect was numerically reversed. CONCLUSIONS: Gene expression scores identified the immune state of vasodilatory shock patients, one of which (IA-P) identified those who may be harmed by hydrocortisone. Gene expression scores may help advance the field of personalized medicine.

Subjects

Anti-Inflammatory Agents/therapeutic use , Gene Expression/physiology , Hydrocortisone/therapeutic use , Immunity/genetics , Shock/drug therapy , Shock/immunology , Aged , Female , Humans , Male , Middle Aged , Precision Medicine , Retrospective Studies , Shock/genetics

18.

Identification of genomic indels and structural variations using split reads.

Zhang, Zhengdong D; Du, Jiang; Lam, Hugo; Abyzov, Alex; Urban, Alexander E; Snyder, Michael; Gerstein, Mark.

BMC Genomics ; 12: 375, 2011 Jul 25.

Article in English | MEDLINE | ID: mdl-21787423

ABSTRACT

BACKGROUND: Recent studies have demonstrated the genetic significance of insertions, deletions, and other more complex structural variants (SVs) in the human population. With the development of the next-generation sequencing technologies, high-throughput surveys of SVs on the whole-genome level have become possible. Here we present split-read identification, calibrated (SRiC), a sequence-based method for SV detection. RESULTS: We start by mapping each read to the reference genome in standard fashion using gapped alignment. Then to identify SVs, we score each of the many initial mappings with an assessment strategy designed to take into account both sequencing and alignment errors (e.g. scoring more highly events gapped in the center of a read). All current SV calling methods have multilevel biases in their identifications due to both experimental and computational limitations (e.g. calling more deletions than insertions). A key aspect of our approach is that we calibrate all our calls against synthetic data sets generated from simulations of high-throughput sequencing (with realistic error models). This allows us to calculate sensitivity and the positive predictive value under different parameter-value scenarios and for different classes of events (e.g. long deletions vs. short insertions). We run our calculations on representative data from the 1000 Genomes Project. Coupling the observed numbers of events on chromosome 1 with the calibrations gleaned from the simulations (for different length events) allows us to construct a relatively unbiased estimate for the total number of SVs in the human genome across a wide range of length scales. We estimate in particular that an individual genome contains ~670,000 indels/SVs. 
CONCLUSIONS: Compared with the existing read-depth and read-pair approaches for SV identification, our method can pinpoint the exact breakpoints of SV events, reveal the actual sequence content of insertions, and cover the whole size spectrum for deletions. Moreover, with the advent of the third-generation sequencing technologies that produce longer reads, we expect our method to be even more useful.

Subjects

Genomics/methods , INDEL Mutation/genetics , Diploidy , Human Genome/genetics , Humans , Reproducibility of Results , Sequence Analysis

19.

Pseudofam: the pseudogene families database.

Lam, Hugo Y K; Khurana, Ekta; Fang, Gang; Cayting, Philip; Carriero, Nicholas; Cheung, Kei-Hoi; Gerstein, Mark B.

Nucleic Acids Res ; 37(Database issue): D738-43, 2009 Jan.

Article in English | MEDLINE | ID: mdl-18957444

ABSTRACT

Pseudofam (http://pseudofam.pseudogene.org) is a database of pseudogene families based on the protein families from the Pfam database. It provides resources for analyzing the family structure of pseudogenes including query tools, statistical summaries and sequence alignments. The current version of Pseudofam contains more than 125,000 pseudogenes identified from 10 eukaryotic genomes and aligned within nearly 3000 families (approximately one-third of the total families in PfamA). Pseudofam uses a large-scale parallelized homology search algorithm (implemented as an extension of the PseudoPipe pipeline) to identify pseudogenes. Each identified pseudogene is assigned to its parent protein family and subsequently aligned to each other by transferring the parent domain alignments from the Pfam family. Pseudogenes are also given additional annotation based on an ontology, reflecting their mode of creation and subsequent history. In particular, our annotation highlights the association of pseudogene families with genomic features, such as segmental duplications. In addition, pseudogene families are associated with key statistics, which identify outlier families with an unusual degree of pseudogenization. The statistics also show how the number of genes and pseudogenes in families correlates across different species. Overall, they highlight the fact that housekeeping families tend to be enriched with a large number of pseudogenes.

Subjects

Genetic Databases , Pseudogenes , Animals , Statistical Data Interpretation , Genomics , Humans , Internet , Proteins/classification , Proteins/genetics , Sequence Alignment

20.

Differential inflammatory responses of the native left and right ventricle associated with donor heart preservation.

Lei, Ienglam; Huang, Wei; Ward, Peter A; Pober, Jordan S; Tellides, George; Ailawadi, Gorav; Pagani, Francis D; Landstrom, Andrew P; Wang, Zhong; Mortensen, Richard M; Cascalho, Marilia; Platt, Jeffrey; Eugene Chen, Yuqing; Lam, Hugo Yu Kor; Tang, Paul C.

Physiol Rep ; 9(17): e15004, 2021 09.

Article in English | MEDLINE | ID: mdl-34435466

ABSTRACT

BACKGROUND: Dysfunction and inflammation of hearts subjected to cold ischemic preservation may differ between left and right ventricles, suggesting distinct strategies for amelioration. METHODS AND RESULTS: Explanted murine hearts subjected to cold ischemia for 0, 4, or 8 h in preservation solution were assessed for function during 60 min of warm perfusion and then analyzed for cell death and inflammation by immunohistochemistry and western blotting and total RNA sequencing. Increased cold ischemic times led to greater left ventricle (LV) dysfunction compared to right ventricle (RV). The LV experienced greater cell death assessed by TUNEL+ cells and cleaved caspase-3 expression (n = 4). While IL-6 protein levels were upregulated in both LV and RV, IL-1β, TNFα, IL-10, and MyD88 were disproportionately increased in the LV. Inflammasome components (NOD-, LRR-, and pyrin domain-containing protein 3 (NLRP3), adaptor molecule apoptosis-associated speck-like protein containing a CARD (ASC), cleaved caspase-1) and products (cleaved IL-1β and gasdermin D) were also more upregulated in the LV. Pathway analysis of RNA sequencing showed increased signaling related to tumor necrosis factor, interferon, and innate immunity with ex-vivo ischemia, but no significant differences were found between the LV and RV. Human donor hearts showed comparable inflammatory responses to cold ischemia with greater LV increases of TNFα, IL-10, and inflammasomes (n = 3). CONCLUSIONS: Mouse hearts subjected to cold ischemia showed time-dependent contractile dysfunction and increased cell death, inflammatory cytokine expression and inflammasome expression that are greater in the LV than RV. However, IL-6 protein elevations and altered transcriptional profiles were similar in both ventricles. Similar changes are observed in human hearts.

Subjects

Heart Ventricles/metabolism , Inflammation Mediators/metabolism , Myocardial Ischemia/metabolism , Organ Preservation Solutions/administration & dosage , Left Ventricular Dysfunction/metabolism , Right Ventricular Dysfunction/metabolism , Animals , Cold Temperature/adverse effects , Female , Heart Transplantation/methods , Heart Ventricles/drug effects , Humans , Male , Mice , Inbred C57BL Mice , Middle Aged , Myocardial Ischemia/physiopathology , Tissue Donors
