Search | VHL Search Portal

1.

Establishing Cerebral Organoids as Models of Human-Specific Brain Evolution.

Pollen, Alex A; Bhaduri, Aparna; Andrews, Madeline G; Nowakowski, Tomasz J; Meyerson, Olivia S; Mostajo-Radji, Mohammed A; Di Lullo, Elizabeth; Alvarado, Beatriz; Bedolli, Melanie; Dougherty, Max L; Fiddes, Ian T; Kronenberg, Zev N; Shuga, Joe; Leyrat, Anne A; West, Jay A; Bershteyn, Marina; Lowe, Craig B; Pavlovic, Bryan J; Salama, Sofie R; Haussler, David; Eichler, Evan E; Kriegstein, Arnold R.

Cell ; 176(4): 743-756.e17, 2019 02 07.

Article in English | MEDLINE | ID: mdl-30735633

ABSTRACT

Direct comparisons of human and non-human primate brains can reveal molecular pathways underlying remarkable specializations of the human brain. However, chimpanzee tissue is inaccessible during neocortical neurogenesis when differences in brain size first appear. To identify human-specific features of cortical development, we leveraged recent innovations that permit generating pluripotent stem cell-derived cerebral organoids from chimpanzee. Despite metabolic differences, organoid models preserve gene regulatory networks related to primary cell types and developmental processes. We further identified 261 differentially expressed genes in human compared to both chimpanzee organoids and macaque cortex, enriched for recent gene duplications, and including multiple regulators of PI3K-AKT-mTOR signaling. We observed increased activation of this pathway in human radial glia, dependent on two receptors upregulated specifically in human: INSR and ITGB8. Our findings establish a platform for systematic analysis of molecular changes contributing to human brain development and evolution.

Subject(s)

Cerebral Cortex/cytology , Organoids/metabolism , Animals , Biological Evolution , Brain/cytology , Cell Culture Techniques/methods , Cell Differentiation/genetics , Cerebral Cortex/metabolism , Gene Regulatory Networks/genetics , Humans , Induced Pluripotent Stem Cells/cytology , Macaca , Neurogenesis/genetics , Organoids/growth & development , Pan troglodytes , Pluripotent Stem Cells/cytology , Single-Cell Analysis , Species Specificity , Transcriptome/genetics

2.

Human-Specific NOTCH2NL Genes Affect Notch Signaling and Cortical Neurogenesis.

Fiddes, Ian T; Lodewijk, Gerrald A; Mooring, Meghan; Bosworth, Colleen M; Ewing, Adam D; Mantalas, Gary L; Novak, Adam M; van den Bout, Anouk; Bishara, Alex; Rosenkrantz, Jimi L; Lorig-Roach, Ryan; Field, Andrew R; Haeussler, Maximilian; Russo, Lotte; Bhaduri, Aparna; Nowakowski, Tomasz J; Pollen, Alex A; Dougherty, Max L; Nuttle, Xander; Addor, Marie-Claude; Zwolinski, Simon; Katzman, Sol; Kriegstein, Arnold; Eichler, Evan E; Salama, Sofie R; Jacobs, Frank M J; Haussler, David.

Cell ; 173(6): 1356-1369.e22, 2018 05 31.

Article in English | MEDLINE | ID: mdl-29856954

ABSTRACT

Genetic changes causing brain size expansion in human evolution have remained elusive. Notch signaling is essential for radial glia stem cell proliferation and is a determinant of neuronal number in the mammalian cortex. We find that three paralogs of human-specific NOTCH2NL are highly expressed in radial glia. Functional analysis reveals that different alleles of NOTCH2NL have varying potencies to enhance Notch signaling by interacting directly with NOTCH receptors. Consistent with a role in Notch signaling, NOTCH2NL ectopic expression delays differentiation of neuronal progenitors, while deletion accelerates differentiation into cortical neurons. Furthermore, NOTCH2NL genes provide the breakpoints in 1q21.1 distal deletion/duplication syndrome, where duplications are associated with macrocephaly and autism and deletions with microcephaly and schizophrenia. Thus, the emergence of human-specific NOTCH2NL genes may have contributed to the rapid evolution of the larger human neocortex, accompanied by loss of genomic stability at the 1q21.1 locus and resulting recurrent neurodevelopmental disorders.

Subject(s)

Brain/embryology , Cerebral Cortex/physiology , Neurogenesis/physiology , Receptor, Notch2/metabolism , Signal Transduction , Animals , Cell Differentiation , Embryonic Stem Cells/metabolism , Female , Gene Deletion , Genes, Reporter , Gorilla gorilla , HEK293 Cells , Humans , Neocortex/cytology , Neural Stem Cells/metabolism , Neuroglia/metabolism , Neurons/metabolism , Pan troglodytes , Receptor, Notch2/genetics , Sequence Analysis, RNA

3.

Molecular Profiling Reveals Biologically Discrete Subsets and Pathways of Progression in Diffuse Glioma.

Ceccarelli, Michele; Barthel, Floris P; Malta, Tathiane M; Sabedot, Thais S; Salama, Sofie R; Murray, Bradley A; Morozova, Olena; Newton, Yulia; Radenbaugh, Amie; Pagnotta, Stefano M; Anjum, Samreen; Wang, Jiguang; Manyam, Ganiraju; Zoppoli, Pietro; Ling, Shiyun; Rao, Arjun A; Grifford, Mia; Cherniack, Andrew D; Zhang, Hailei; Poisson, Laila; Carlotti, Carlos Gilberto; Tirapelli, Daniela Pretti da Cunha; Rao, Arvind; Mikkelsen, Tom; Lau, Ching C; Yung, W K Alfred; Rabadan, Raul; Huse, Jason; Brat, Daniel J; Lehman, Norman L; Barnholtz-Sloan, Jill S; Zheng, Siyuan; Hess, Kenneth; Rao, Ganesh; Meyerson, Matthew; Beroukhim, Rameen; Cooper, Lee; Akbani, Rehan; Wrensch, Margaret; Haussler, David; Aldape, Kenneth D; Laird, Peter W; Gutmann, David H; Noushmehr, Houtan; Iavarone, Antonio; Verhaak, Roel G W.

Cell ; 164(3): 550-63, 2016 Jan 28.

Article in English | MEDLINE | ID: mdl-26824661

ABSTRACT

Therapy development for adult diffuse glioma is hindered by incomplete knowledge of somatic glioma driving alterations and suboptimal disease classification. We defined the complete set of genes associated with 1,122 diffuse grade II-III-IV gliomas from The Cancer Genome Atlas and used molecular profiles to improve disease classification, identify molecular correlations, and provide insights into the progression from low- to high-grade disease. Whole-genome sequencing data analysis determined that ATRX but not TERT promoter mutations are associated with increased telomere length. Recent advances in glioma classification based on IDH mutation and 1p/19q co-deletion status were recapitulated through analysis of DNA methylation profiles, which identified clinically relevant molecular subsets. A subtype of IDH mutant glioma was associated with DNA demethylation and poor outcome; a group of IDH-wild-type diffuse glioma showed molecular similarity to pilocytic astrocytoma and relatively favorable survival. Understanding of cohesive disease groups may aid improved clinical outcomes.

Subject(s)

Brain Neoplasms/genetics , Brain Neoplasms/pathology , Glioma/genetics , Glioma/pathology , Transcriptome , Adult , Brain Neoplasms/metabolism , Cell Proliferation , Cluster Analysis , DNA Helicases/genetics , DNA Methylation , Epigenesis, Genetic , Glioma/metabolism , Humans , Isocitrate Dehydrogenase/genetics , Middle Aged , Mutation , Nuclear Proteins/genetics , Promoter Regions, Genetic , Signal Transduction , Telomerase/genetics , Telomere , X-linked Nuclear Protein

4.

Pandemic-scale phylogenomics reveals the SARS-CoV-2 recombination landscape.

Turakhia, Yatish; Thornlow, Bryan; Hinrichs, Angie; McBroome, Jakob; Ayala, Nicolas; Ye, Cheng; Smith, Kyle; De Maio, Nicola; Haussler, David; Lanfear, Robert; Corbett-Detig, Russell.

Nature ; 609(7929): 994-997, 2022 09.

Article in English | MEDLINE | ID: mdl-35952714

ABSTRACT

Accurate and timely detection of recombinant lineages is crucial for interpreting genetic variation, reconstructing epidemic spread, identifying selection and variants of interest, and accurately performing phylogenetic analyses [1-4]. During the SARS-CoV-2 pandemic, genomic data generation has exceeded the capacities of existing analysis platforms, thereby crippling real-time analysis of viral evolution [5]. Here, we use a new phylogenomic method to search a nearly comprehensive SARS-CoV-2 phylogeny for recombinant lineages. In a 1.6 million sample tree from May 2021, we identify 589 recombination events, which indicate that around 2.7% of sequenced SARS-CoV-2 genomes have detectable recombinant ancestry. Recombination breakpoints are inferred to occur disproportionately in the 3' portion of the genome that contains the spike protein. Our results highlight the need for timely analyses of recombination for pinpointing the emergence of recombinant lineages with the potential to increase transmissibility or virulence of the virus. We anticipate that this approach will empower comprehensive real-time tracking of viral recombination during the SARS-CoV-2 pandemic and beyond.

Subject(s)

COVID-19 , Genome, Viral , Pandemics , Phylogeny , Recombination, Genetic , SARS-CoV-2 , COVID-19/epidemiology , COVID-19/transmission , COVID-19/virology , Genome, Viral/genetics , Humans , Mutation , Recombination, Genetic/genetics , SARS-CoV-2/genetics , SARS-CoV-2/pathogenicity , Selection, Genetic/genetics , Spike Glycoprotein, Coronavirus/genetics , Virulence/genetics

5.

The Human Pangenome Project: a global resource to map genomic diversity.

Wang, Ting; Antonacci-Fulton, Lucinda; Howe, Kerstin; Lawson, Heather A; Lucas, Julian K; Phillippy, Adam M; Popejoy, Alice B; Asri, Mobin; Carson, Caryn; Chaisson, Mark J P; Chang, Xian; Cook-Deegan, Robert; Felsenfeld, Adam L; Fulton, Robert S; Garrison, Erik P; Garrison, Nanibaa' A; Graves-Lindsay, Tina A; Ji, Hanlee; Kenny, Eimear E; Koenig, Barbara A; Li, Daofeng; Marschall, Tobias; McMichael, Joshua F; Novak, Adam M; Purushotham, Deepak; Schneider, Valerie A; Schultz, Baergen I; Smith, Michael W; Sofia, Heidi J; Weissman, Tsachy; Flicek, Paul; Li, Heng; Miga, Karen H; Paten, Benedict; Jarvis, Erich D; Hall, Ira M; Eichler, Evan E; Haussler, David.

Nature ; 604(7906): 437-446, 2022 04.

Article in English | MEDLINE | ID: mdl-35444317

ABSTRACT

The human reference genome is the most widely used resource in human genetics and is due for a major update. Its current structure is a linear composite of merged haplotypes from more than 20 people, with a single individual comprising most of the sequence. It contains biases and errors within a framework that does not represent global human genomic variation. A high-quality reference with global representation of common variants, including single-nucleotide variants, structural variants and functional elements, is needed. The Human Pangenome Reference Consortium aims to create a more sophisticated and complete human reference genome with a graph-based, telomere-to-telomere representation of global genomic diversity. Here we leverage innovations in technology, study design and global partnerships with the goal of constructing the highest-possible quality human pangenome reference. Our goal is to improve data representation and streamline analyses to enable routine assembly of complete diploid genomes. With attention to ethical frameworks, the human pangenome reference will contain a more accurate and diverse representation of global genomic variation, improve gene-disease association studies across populations, expand the scope of genomics research to the most repetitive and polymorphic regions of the genome, and serve as the ultimate genetic resource for future biomedical research and precision medicine.

Subject(s)

Genome, Human , Genomics , Genome, Human/genetics , Haplotypes/genetics , High-Throughput Nucleotide Sequencing , Humans , Sequence Analysis, DNA

6.

Progressive Cactus is a multiple-genome aligner for the thousand-genome era.

Armstrong, Joel; Hickey, Glenn; Diekhans, Mark; Fiddes, Ian T; Novak, Adam M; Deran, Alden; Fang, Qi; Xie, Duo; Feng, Shaohong; Stiller, Josefin; Genereux, Diane; Johnson, Jeremy; Marinescu, Voichita Dana; Alföldi, Jessica; Harris, Robert S; Lindblad-Toh, Kerstin; Haussler, David; Karlsson, Elinor; Jarvis, Erich D; Zhang, Guojie; Paten, Benedict.

Nature ; 587(7833): 246-251, 2020 11.

Article in English | MEDLINE | ID: mdl-33177663

ABSTRACT

New genome assemblies have been arriving at a rapidly increasing pace, thanks to decreases in sequencing costs and improvements in third-generation sequencing technologies [1-3]. For example, the number of vertebrate genome assemblies currently in the NCBI (National Center for Biotechnology Information) database [4] increased by more than 50% to 1,485 assemblies in the year from July 2018 to July 2019. In addition to this influx of assemblies from different species, new human de novo assemblies [5] are being produced, which enable the analysis of not only small polymorphisms, but also complex, large-scale structural differences between human individuals and haplotypes. This coming era and its unprecedented amount of data offer the opportunity to uncover many insights into genome evolution but also present challenges in how to adapt current analysis methods to meet the increased scale. Cactus [6], a reference-free multiple genome alignment program, has been shown to be highly accurate, but the existing implementation scales poorly with increasing numbers of genomes, and struggles in regions of highly duplicated sequences. Here we describe progressive extensions to Cactus to create Progressive Cactus, which enables the reference-free alignment of tens to thousands of large vertebrate genomes while maintaining high alignment quality. We describe results from an alignment of more than 600 amniote genomes, which is to our knowledge the largest multiple vertebrate genome alignment created so far.

Subject(s)

Genome/genetics , Genomics/methods , Sequence Alignment/methods , Software , Vertebrates/genetics , Amnion , Animals , Computer Simulation , Genomics/standards , Haplotypes , Humans , Quality Control , Sequence Alignment/standards , Software/standards

7.

The UCSC Genome Browser database: 2024 update.

Raney, Brian J; Barber, Galt P; Benet-Pagès, Anna; Casper, Jonathan; Clawson, Hiram; Cline, Melissa S; Diekhans, Mark; Fischer, Clayton; Navarro Gonzalez, Jairo; Hickey, Glenn; Hinrichs, Angie S; Kuhn, Robert M; Lee, Brian T; Lee, Christopher M; Le Mercier, Phillipe; Miga, Karen H; Nassar, Luis R; Nejad, Parisa; Paten, Benedict; Perez, Gerardo; Schmelter, Daniel; Speir, Matthew L; Wick, Brittney D; Zweig, Ann S; Haussler, David; Kent, W James; Haeussler, Maximilian.

Nucleic Acids Res ; 52(D1): D1082-D1088, 2024 Jan 05.

Article in English | MEDLINE | ID: mdl-37953330

ABSTRACT

The UCSC Genome Browser (https://genome.ucsc.edu) is a web-based genomic visualization and analysis tool that serves data to over 7,000 distinct users per day worldwide. It provides annotation data on thousands of genome assemblies, ranging from human to SARS-CoV-2. This year, we have introduced new data from the Human Pangenome Reference Consortium and on viral genomes including SARS-CoV-2. We have added 1,200 new genomes to our GenArk genome system, increasing the overall diversity of our genomic representation. We have added support for nine new user-contributed track hubs to our public hub system. Additionally, we have released 29 new tracks on the human genome and 11 new tracks on the mouse genome. Collectively, these new features expand both the breadth and depth of the genomic knowledge that we share publicly with users worldwide.

Subject(s)

Databases, Genetic , Genomics , RNA, Viral , Animals , Humans , Mice , Genome, Human , Genome, Viral , Internet , Molecular Sequence Annotation , Software

8.

A complete pedigree-based graph workflow for rare candidate variant analysis.

Markello, Charles; Huang, Charles; Rodriguez, Alex; Carroll, Andrew; Chang, Pi-Chuan; Eizenga, Jordan; Markello, Thomas; Haussler, David; Paten, Benedict.

Genome Res ; 32(5): 893-903, 2022 05.

Article in English | MEDLINE | ID: mdl-35483961

ABSTRACT

Methods that use a linear genome reference for genome sequencing data analysis are reference-biased. In the field of clinical genetics for rare diseases, a resulting reduction in genotyping accuracy in some regions has likely prevented the resolution of some cases. Pangenome graphs embed population variation into a reference structure. Although pangenome graphs have helped to reduce reference mapping bias, further performance improvements are possible. We introduce VG-Pedigree, a pedigree-aware workflow based on the pangenome-mapping tool Giraffe and the variant calling tool DeepTrio using a specially trained model for Giraffe-based alignments. We demonstrate mapping and variant calling improvements in both single-nucleotide variants (SNVs) and insertion and deletion (indel) variants over those produced by alignments created using BWA-MEM to a linear reference and Giraffe mapping to a pangenome graph containing data from the 1000 Genomes Project. We have also adapted and upgraded deleterious-variant (DV) detecting methods and programs into a streamlined workflow. We used these workflows in combination to detect small lists of candidate DVs among 15 family quartets and quintets of the Undiagnosed Diseases Program (UDP). All candidate DVs that were previously diagnosed using the Mendelian models covered by the previously published methods were recapitulated by these workflows. The results of these experiments indicate that a slightly greater absolute count of DVs is detected in the proband population than in their matched unaffected siblings.

Subject(s)

Genome , Polymorphism, Single Nucleotide , High-Throughput Nucleotide Sequencing , INDEL Mutation , Pedigree , Software , Workflow

9.

A hidden layer of structural variation in transposable elements reveals potential genetic modifiers in human disease-risk loci.

van Bree, Elisabeth J; Guimarães, Rita L F P; Lundberg, Mischa; Blujdea, Elena R; Rosenkrantz, Jimi L; White, Fred T G; Poppinga, Josse; Ferrer-Raventós, Paula; Schneider, Anne-Fleur E; Clayton, Isabella; Haussler, David; Reinders, Marcel J T; Holstege, Henne; Ewing, Adam D; Moses, Colette; Jacobs, Frank M J.

Genome Res ; 32(4): 656-670, 2022 04.

Article in English | MEDLINE | ID: mdl-35332097

ABSTRACT

Genome-wide association studies (GWAS) have been highly informative in discovering disease-associated loci but are not designed to capture all structural variations in the human genome. Using long-read sequencing data, we discovered widespread structural variation within SINE-VNTR-Alu (SVA) elements, a class of great ape-specific transposable elements with gene-regulatory roles, which represents a major source of structural variability in the human population. We highlight the presence of structurally variable SVAs (SV-SVAs) in neurological disease-associated loci, and we further associate SV-SVAs to disease-associated SNPs and differential gene expression using luciferase assays and expression quantitative trait loci data. Finally, we genetically deleted SV-SVAs in the BIN1 and CD2AP Alzheimer's disease-associated risk loci and in the BCKDK Parkinson's disease-associated risk locus and assessed multiple aspects of their gene-regulatory influence in a human neuronal context. Together, this study reveals a novel layer of genetic variation in transposable elements that may contribute to identification of the structural variants that are the actual drivers of disease associations of GWAS loci.

Subject(s)

DNA Transposable Elements , Genome-Wide Association Study , Alu Elements , DNA Transposable Elements/genetics , Genetic Predisposition to Disease , Genetic Variation , Genome, Human , Humans , Polymorphism, Single Nucleotide , Quantitative Trait Loci

10.

The UCSC Genome Browser database: 2023 update.

Nassar, Luis R; Barber, Galt P; Benet-Pagès, Anna; Casper, Jonathan; Clawson, Hiram; Diekhans, Mark; Fischer, Clay; Gonzalez, Jairo Navarro; Hinrichs, Angie S; Lee, Brian T; Lee, Christopher M; Muthuraman, Pranav; Nguy, Beagan; Pereira, Tiana; Nejad, Parisa; Perez, Gerardo; Raney, Brian J; Schmelter, Daniel; Speir, Matthew L; Wick, Brittney D; Zweig, Ann S; Haussler, David; Kuhn, Robert M; Haeussler, Maximilian; Kent, W James.

Nucleic Acids Res ; 51(D1): D1188-D1195, 2023 01 06.

Article in English | MEDLINE | ID: mdl-36420891

ABSTRACT

The UCSC Genome Browser (https://genome.ucsc.edu) is an omics data consolidator, graphical viewer, and general bioinformatics resource that continues to serve the community as it enters its 23rd year. This year has seen an emphasis on clinical data, with new tracks and an expanded Recommended Track Sets feature on hg38 as well as the addition of a single cell track group. SARS-CoV-2 remains a focus, with regular annotation updates to the browser and continued curation of our phylogenetic sequence placing tool, hgPhyloPlace, whose tree has now reached over 12M sequences. Our GenArk resource has also grown, offering over 2500 hubs and a system for users to request any absent assemblies. We have expanded our bigBarChart display type and created new ways to visualize data via bigRmsk and dynseq display. Displaying custom annotations is now easier due to our chromAlias system, which eliminates the requirement for renaming sequence names to the UCSC standard. Users involved in data generation may also be interested in our new tools and trackDb settings, which facilitate the creation and display of their custom annotations.

Subject(s)

Databases, Genetic , Genomics , Humans , COVID-19/epidemiology , COVID-19/genetics , Genomics/methods , Internet , Phylogeny , SARS-CoV-2/genetics , Software , Web Browser

11.

Positive selection in noncoding genomic regions of vocal learning birds is associated with genes implicated in vocal learning and speech functions in humans.

Cahill, James A; Armstrong, Joel; Deran, Alden; Khoury, Carolyn J; Paten, Benedict; Haussler, David; Jarvis, Erich D.

Genome Res ; 31(11): 2035-2049, 2021 11.

Article in English | MEDLINE | ID: mdl-34667117

ABSTRACT

Vocal learning, the ability to imitate sounds from conspecifics and the environment, is a key component of human spoken language and learned song in three independently evolved avian groups: oscine songbirds, parrots, and hummingbirds. Humans and each of these three bird clades exhibit specialized behavioral, neuroanatomical, and brain gene expression convergence related to vocal learning, speech, and song. To understand the evolutionary basis of vocal learning gene specializations and convergence, we searched for and identified accelerated genomic regions (ARs), a marker of positive selection, specific to vocal learning birds. We found avian vocal learner-specific ARs, and they were enriched in noncoding regions near genes with known speech functions or brain gene expression specializations in humans and vocal learning birds, including FOXP2, NEUROD6, ZEB2, and MEF2C, and near genes with major neurodevelopmental functions, including NR2F1, NRP2, and BCL11B. We also found enrichment near the SFARI class S genes associated with syndromic vocal communication forms of autism spectrum disorders. These findings reveal strong candidate noncoding regions near genes for the evolutionary adaptations that distinguish vocal learning species from their close vocal nonlearning relatives and provide further evidence of molecular convergence between birdsong and human spoken language.

Subject(s)

Songbirds , Speech , Animals , Brain/metabolism , Genomics , Humans , Learning , Repressor Proteins/metabolism , Songbirds/genetics , Tumor Suppressor Proteins/metabolism , Vocalization, Animal

12.

Posttranscriptional crossregulation between Drosha and DGCR8.

Han, Jinju; Pedersen, Jakob S; Kwon, S Chul; Belair, Cassandra D; Kim, Young-Kook; Yeom, Kyu-Hyeon; Yang, Woo-Young; Haussler, David; Blelloch, Robert; Kim, V Narry.

Cell ; 136(1): 75-84, 2009 Jan 09.

Article in English | MEDLINE | ID: mdl-19135890

ABSTRACT

The Drosha-DGCR8 complex, also known as Microprocessor, is essential for microRNA (miRNA) maturation. Drosha functions as the catalytic subunit, while DGCR8 (also known as Pasha) recognizes the RNA substrate. Although the action mechanism of this complex has been intensively studied, it remains unclear how Drosha and DGCR8 are regulated and if these proteins have any additional role(s) apart from miRNA processing. Here, we report that Drosha and DGCR8 regulate each other posttranscriptionally. The Drosha-DGCR8 complex cleaves the hairpin structures embedded in the DGCR8 mRNA and thereby destabilizes the mRNA. We further find that DGCR8 stabilizes the Drosha protein via protein-protein interaction. This crossregulation between Drosha and DGCR8 may contribute to the homeostatic control of miRNA biogenesis. Furthermore, microarray analyses suggest that a number of mRNAs may be downregulated in a Microprocessor-dependent, miRNA-independent manner. Our study reveals a previously unsuspected function of Microprocessor in mRNA stability control.

Subject(s)

Gene Expression Regulation , Proteins/genetics , RNA Stability , Ribonuclease III/genetics , Animals , Base Sequence , Cell Line , Humans , Molecular Sequence Data , Nucleic Acid Conformation , Proteins/metabolism , RNA Interference , RNA-Binding Proteins , Ribonuclease III/metabolism

13.

The UCSC Genome Browser database: 2022 update.

Lee, Brian T; Barber, Galt P; Benet-Pagès, Anna; Casper, Jonathan; Clawson, Hiram; Diekhans, Mark; Fischer, Clay; Gonzalez, Jairo Navarro; Hinrichs, Angie S; Lee, Christopher M; Muthuraman, Pranav; Nassar, Luis R; Nguy, Beagan; Pereira, Tiana; Perez, Gerardo; Raney, Brian J; Rosenbloom, Kate R; Schmelter, Daniel; Speir, Matthew L; Wick, Brittney D; Zweig, Ann S; Haussler, David; Kuhn, Robert M; Haeussler, Maximilian; Kent, W James.

Nucleic Acids Res ; 50(D1): D1115-D1122, 2022 01 07.

Article in English | MEDLINE | ID: mdl-34718705

ABSTRACT

The UCSC Genome Browser, https://genome.ucsc.edu, is a graphical viewer for exploring genome annotations. The website provides integrated tools for visualizing, comparing, analyzing, and sharing both publicly available and user-generated genomic datasets. Data highlights this year include a collection of easily accessible public hub assemblies on new organisms, now featuring BLAT alignment and PCR capabilities, and new and updated clinical tracks (gnomAD, DECIPHER, CADD, REVEL). We introduced a new Track Sets feature and enhanced variant displays to aid in the interpretation of clinical data. We also added a tool to rapidly place new SARS-CoV-2 genomes in a global phylogenetic tree enabling researchers to view the context of emerging mutations in our SARS-CoV-2 Genome Browser. Other new software focuses on usability features, including more informative mouseover displays and new fonts.

Subject(s)

Databases, Genetic , Web Browser , Animals , Genome, Human , Humans , Phylogeny , Polymerase Chain Reaction , SARS-CoV-2/genetics , User-Computer Interface , Exome Sequencing

14.

Author Correction: Comparative and demographic analysis of orang-utan genomes.

Locke, Devin P; Hillier, LaDeana W; Warren, Wesley C; Worley, Kim C; Nazareth, Lynne V; Muzny, Donna M; Yang, Shiaw-Pyng; Wang, Zhengyuan; Chinwalla, Asif T; Minx, Pat; Mitreva, Makedonka; Cook, Lisa; Delehaunty, Kim D; Fronick, Catrina; Schmidt, Heather; Fulton, Lucinda A; Fulton, Robert S; Nelson, Joanne O; Magrini, Vincent; Pohl, Craig; Graves, Tina A; Markovic, Chris; Cree, Andy; Dinh, Huyen H; Hume, Jennifer; Kovar, Christie L; Fowler, Gerald R; Lunter, Gerton; Meader, Stephen; Heger, Andreas; Ponting, Chris P; Marques-Bonet, Tomas; Alkan, Can; Chen, Lin; Cheng, Ze; Kidd, Jeffrey M; Eichler, Evan E; White, Simon; Searle, Stephen; Vilella, Albert J; Chen, Yuan; Flicek, Paul; Ma, Jian; Raney, Brian; Suh, Bernard; Burhans, Richard; Herrero, Javier; Haussler, David; Faria, Rui; Fernando, Olga.

Nature ; 608(7924): E36, 2022 Aug.

Article in English | MEDLINE | ID: mdl-35962045

15.

The UCSC Genome Browser database: 2021 update.

Navarro Gonzalez, Jairo; Zweig, Ann S; Speir, Matthew L; Schmelter, Daniel; Rosenbloom, Kate R; Raney, Brian J; Powell, Conner C; Nassar, Luis R; Maulding, Nathan D; Lee, Christopher M; Lee, Brian T; Hinrichs, Angie S; Fyfe, Alastair C; Fernandes, Jason D; Diekhans, Mark; Clawson, Hiram; Casper, Jonathan; Benet-Pagès, Anna; Barber, Galt P; Haussler, David; Kuhn, Robert M; Haeussler, Maximilian; Kent, W James.

Nucleic Acids Res ; 49(D1): D1046-D1057, 2021 01 08.

Article in English | MEDLINE | ID: mdl-33221922

ABSTRACT

For more than two decades, the UCSC Genome Browser database (https://genome.ucsc.edu) has provided high-quality genomics data visualization and genome annotations to the research community. As the field of genomics grows and more data become available, new modes of display are required to accommodate new technologies. New features released this past year include a Hi-C heatmap display, a phased family trio display for VCF files, and various track visualization improvements. Striving to keep data up-to-date, new updates to gene annotations include GENCODE Genes, NCBI RefSeq Genes, and Ensembl Genes. New data tracks added for human and mouse genomes include the ENCODE registry of candidate cis-regulatory elements, promoters from the Eukaryotic Promoter Database, and NCBI RefSeq Select and Matched Annotation from NCBI and EMBL-EBI (MANE). Within weeks of learning about the outbreak of coronavirus, UCSC released a genome browser, with detailed annotation tracks, for the SARS-CoV-2 RNA reference assembly.

Subject(s)

COVID-19/prevention & control , Computational Biology/methods , Databases, Genetic , Genome/genetics , Genomics/methods , SARS-CoV-2/genetics , Animals , COVID-19/epidemiology , COVID-19/virology , Data Curation/methods , Epidemics , Humans , Internet , Mice , Molecular Sequence Annotation/methods , SARS-CoV-2/physiology , Software

16.

Stability of SARS-CoV-2 phylogenies.

Turakhia, Yatish; De Maio, Nicola; Thornlow, Bryan; Gozashti, Landen; Lanfear, Robert; Walker, Conor R; Hinrichs, Angie S; Fernandes, Jason D; Borges, Rui; Slodkowicz, Greg; Weilguny, Lukas; Haussler, David; Goldman, Nick; Corbett-Detig, Russell.

PLoS Genet ; 16(11): e1009175, 2020 11.

Article in English | MEDLINE | ID: mdl-33206635

ABSTRACT

The SARS-CoV-2 pandemic has led to unprecedented, nearly real-time genetic tracing due to the rapid community sequencing response. Researchers immediately leveraged these data to infer the evolutionary relationships among viral samples and to study key biological questions, including whether host viral genome editing and recombination are features of SARS-CoV-2 evolution. This global sequencing effort is inherently decentralized and must rely on data collected by many labs using a wide variety of molecular and bioinformatic techniques. There is thus a strong possibility that systematic errors associated with lab- or protocol-specific practices affect some sequences in the repositories. We find that some recurrent mutations in reported SARS-CoV-2 genome sequences have been observed predominantly or exclusively by single labs, co-localize with commonly used primer binding sites and are more likely to affect the protein-coding sequences than other similarly recurrent mutations. We show that their inclusion can affect phylogenetic inference on scales relevant to local lineage tracing, and make it appear as though there has been an excess of recurrent mutation or recombination among viral lineages. We suggest how samples can be screened and problematic variants removed, and we plan to regularly inform the scientific community with our updated results as more SARS-CoV-2 genome sequences are shared (https://virological.org/t/issues-with-sars-cov-2-sequencing-data/473 and https://virological.org/t/masking-strategies-for-sars-cov-2-alignments/480). We also develop tools for comparing and visualizing differences among very large phylogenies and we show that consistent clade- and tree-based comparisons can be made between phylogenies produced by different groups. These will facilitate evolutionary inferences and comparisons among phylogenies produced for a wide array of purposes.
Building on the SARS-CoV-2 Genome Browser at UCSC, we present a toolkit to compare, analyze and combine SARS-CoV-2 phylogenies, find and remove potential sequencing errors and establish a widely shared, stable clade structure for a more accurate scientific inference and discourse.

Subject(s)

Genome, Viral/genetics , Phylogeny , SARS-CoV-2/genetics , Algorithms , COVID-19 , Computational Biology , Evolution, Molecular , Humans , RNA, Viral/genetics , Sequence Alignment , Whole Genome Sequencing

17.

A Daily-Updated Database and Tools for Comprehensive SARS-CoV-2 Mutation-Annotated Trees.

McBroome, Jakob; Thornlow, Bryan; Hinrichs, Angie S; Kramer, Alexander; De Maio, Nicola; Goldman, Nick; Haussler, David; Corbett-Detig, Russell; Turakhia, Yatish.

Mol Biol Evol ; 38(12): 5819-5824, 2021 12 09.

Article in English | MEDLINE | ID: mdl-34469548

ABSTRACT

The vast scale of SARS-CoV-2 sequencing data has made it increasingly challenging to comprehensively analyze all available data using existing tools and file formats. To address this, we present a database of SARS-CoV-2 phylogenetic trees inferred with unrestricted public sequences, which we update daily to incorporate new sequences. Our database uses the recently proposed mutation-annotated tree (MAT) format to efficiently encode the tree with branches labeled with parsimony-inferred mutations, as well as Nextstrain clade and Pango lineage labels at clade roots. As of June 9, 2021, our SARS-CoV-2 MAT consists of 834,521 sequences and provides a comprehensive view of the virus' evolutionary history using public data. We also present matUtils, a command-line utility for rapidly querying, interpreting, and manipulating the MATs. Our daily-updated SARS-CoV-2 MAT database and matUtils software are available at http://hgdownload.soe.ucsc.edu/goldenPath/wuhCor1/UShER_SARS-CoV-2/ and https://github.com/yatisht/usher, respectively.

Subject(s)

Evolution, Molecular , Phylogeny , SARS-CoV-2 , COVID-19/virology , Humans , Mutation , SARS-CoV-2/genetics , Software

18.

Share pandemic sequences openly and fast.

Haussler, David; Haeussler, Max; Hinrichs, Angie; Corbett-Detig, Russell; Bjork, Isabel.

Nature ; 591(7849): 202, 2021 03.

Article in English | MEDLINE | ID: mdl-33658678

Subject(s)

Databases, Genetic , Genome, Viral/genetics , Information Dissemination , SARS-CoV-2/genetics , Animals , Birds/virology , COVID-19/epidemiology , COVID-19/prevention & control , COVID-19/virology , Databases, Genetic/legislation & jurisprudence , Humans , Influenza in Birds/virology , Information Dissemination/legislation & jurisprudence , Ownership/legislation & jurisprudence , Preprints as Topic , Sequence Analysis, DNA , Time Factors

19.

UCSC Genome Browser enters 20th year.

Lee, Christopher M; Barber, Galt P; Casper, Jonathan; Clawson, Hiram; Diekhans, Mark; Gonzalez, Jairo Navarro; Hinrichs, Angie S; Lee, Brian T; Nassar, Luis R; Powell, Conner C; Raney, Brian J; Rosenbloom, Kate R; Schmelter, Daniel; Speir, Matthew L; Zweig, Ann S; Haussler, David; Haeussler, Maximilian; Kuhn, Robert M; Kent, W James.

Nucleic Acids Res ; 48(D1): D756-D761, 2020 01 08.

Article in English | MEDLINE | ID: mdl-31691824

ABSTRACT

The University of California Santa Cruz Genome Browser website (https://genome.ucsc.edu) enters its 20th year of providing high-quality genomics data visualization and genome annotations to the research community. In the past year, we have added a new option to our web BLAT tool that allows search against all genomes, a single-cell expression viewer (https://cells.ucsc.edu), a 'lollipop' plot display mode for high-density variation data, a RESTful API for data extraction and a custom-track backup feature. New datasets include Tabula Muris single-cell expression data, GeneHancer regulatory annotations, The Cancer Genome Atlas Pan-Cancer variants, Genome Reference Consortium Patch sequences, new ENCODE transcription factor binding site peaks and clusters, the Database of Genomic Variants Gold Standard Variants, Genomenon Mastermind variants and three new multi-species alignment tracks.

Subject(s)

Databases, Genetic , Genome, Human , Software , Genomics , Humans , Internet

20.

Comparative Annotation Toolkit (CAT): simultaneous clade and personal genome annotation.

Fiddes, Ian T; Armstrong, Joel; Diekhans, Mark; Nachtweide, Stefanie; Kronenberg, Zev N; Underwood, Jason G; Gordon, David; Earl, Dent; Keane, Thomas; Eichler, Evan E; Haussler, David; Stanke, Mario; Paten, Benedict.

Genome Res ; 28(7): 1029-1038, 2018 07.

Article in English | MEDLINE | ID: mdl-29884752

ABSTRACT

The recent introductions of low-cost, long-read, and read-cloud sequencing technologies coupled with intense efforts to develop efficient algorithms have made affordable, high-quality de novo sequence assembly a realistic proposition. The result is an explosion of new, ultracontiguous genome assemblies. To compare these genomes, we need robust methods for genome annotation. We describe the fully open source Comparative Annotation Toolkit (CAT), which provides a flexible way to simultaneously annotate entire clades and identify orthology relationships. We show that CAT can be used to improve annotations on the rat genome, annotate the great apes, annotate a diverse set of mammals, and annotate personal, diploid human genomes. We demonstrate the resulting discovery of novel genes, isoforms, and structural variants (even in genomes as well studied as rat and the great apes) and how these annotations improve cross-species RNA expression experiments.

Subject(s)

Genome, Human/genetics , Algorithms , Animals , High-Throughput Nucleotide Sequencing/methods , Humans , Molecular Sequence Annotation/methods , RNA/genetics , Rats
