Results 1 - 20 of 63
1.
Cell ; 175(7): 1972-1988.e16, 2018 12 13.
Article in English | MEDLINE | ID: mdl-30550791

ABSTRACT

In vitro cancer cultures, including three-dimensional organoids, typically contain exclusively neoplastic epithelium but require artificial reconstitution to recapitulate the tumor microenvironment (TME). The co-culture of primary tumor epithelia with endogenous, syngeneic tumor-infiltrating lymphocytes (TILs) as a cohesive unit has been particularly elusive. Here, an air-liquid interface (ALI) method propagated patient-derived organoids (PDOs) from >100 human biopsies or mouse tumors in syngeneic immunocompetent hosts as tumor epithelia with native embedded immune cells (T, B, NK, macrophages). Robust droplet-based, single-cell simultaneous determination of gene expression and immune repertoire indicated that PDO TILs accurately preserved the original tumor T cell receptor (TCR) spectrum. Crucially, human and murine PDOs successfully modeled immune checkpoint blockade (ICB) with anti-PD-1- and/or anti-PD-L1 expanding and activating tumor antigen-specific TILs and eliciting tumor cytotoxicity. Organoid-based propagation of primary tumor epithelium en bloc with endogenous immune stroma should enable immuno-oncology investigations within the TME and facilitate personalized immunotherapy testing.


Subjects
Immunological Models , Experimental Neoplasms/immunology , Organoids/immunology , T-Cell Antigen Receptors/immunology , Tumor Microenvironment/immunology , Animals , B7-H1 Antigen/immunology , Coculture Techniques , Female , Humans , Immunotherapy , Male , Mice , Inbred BALB C Mice , Neoplasm Proteins/immunology , Experimental Neoplasms/pathology , Experimental Neoplasms/therapy , Organoids/pathology
2.
Nature ; 622(7981): 41-47, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37794265

ABSTRACT

Scientists have been trying to identify every gene in the human genome since the initial draft was published in 2001. In the years since, much progress has been made in identifying protein-coding genes, currently estimated to number fewer than 20,000, with an ever-expanding number of distinct protein-coding isoforms. Here we review the status of the human gene catalogue and the efforts to complete it in recent years. Beside the ongoing annotation of protein-coding genes, their isoforms and pseudogenes, the invention of high-throughput RNA sequencing and other technological breakthroughs have led to a rapid growth in the number of reported non-coding RNA genes. For most of these non-coding RNAs, the functional relevance is currently unclear; we look at recent advances that offer paths forward to identifying their functions and towards eventually completing the human gene catalogue. Finally, we examine the need for a universal annotation standard that includes all medically significant genes and maintains their relationships with different reference genomes for the use of the human gene catalogue in clinical settings.


Subjects
Genes , Human Genome , Molecular Sequence Annotation , Protein Isoforms , Humans , Human Genome/genetics , Molecular Sequence Annotation/standards , Molecular Sequence Annotation/trends , Protein Isoforms/genetics , Human Genome Project , Pseudogenes , RNA/genetics
3.
Mol Psychiatry ; 2024 May 23.
Article in English | MEDLINE | ID: mdl-38783055

ABSTRACT

Pharmacogenomic testing has emerged as an aid in clinical decision making for psychiatric providers, but more data is needed regarding its utility in clinical practice and potential impact on patient care. In this cross-sectional study, we determined the real-world prevalence of pharmacogenomic actionability in patients receiving psychiatric care. Potential actionability was based on the prevalence of CYP2C19 and CYP2D6 phenotypes, including CYP2D6 allele-specific copy number variations (CNVs). Combined actionability additionally incorporated CYP2D6 phenoconversion and the novel CYP2C-TG haplotype in patients with available medication data. Across 15,000 patients receiving clinical pharmacogenomic testing, 65% had potentially actionable CYP2D6 and CYP2C19 phenotypes, and phenotype assignment was impacted by CYP2D6 allele-specific CNVs in 2% of all patients. Of 4114 patients with medication data, 42% had CYP2D6 phenoconversion from drug interactions and 20% carried a novel CYP2C haplotype potentially altering actionability. A total of 87% had some form of potential actionability from genetic findings and/or phenoconversion. Genetic variation detected via next-generation sequencing led to phenotype reassignment in 22% of individuals overall (2% in CYP2D6 and 20% in CYP2C19). Ultimately, pharmacogenomic testing using next-generation sequencing identified potential actionability in most patients receiving psychiatric care. Early pharmacogenomic testing may provide actionable insights to aid clinicians in drug prescribing to optimize psychiatric care.

4.
Breast Cancer Res ; 25(1): 58, 2023 05 25.
Article in English | MEDLINE | ID: mdl-37231433

ABSTRACT

BACKGROUND: Endocrine-resistant HR+/HER2- breast cancer (BC) and triple-negative BC (TNBC) are of interest for molecularly informed treatment due to their aggressive natures and limited treatment profiles. Patients of African Ancestry (AA) experience higher rates of TNBC and mortality than European Ancestry (EA) patients, despite lower overall BC incidence. Here, we compare the molecular landscapes of AA and EA patients with HR+/HER2- BC and TNBC in a real-world cohort to promote equity in precision oncology by illuminating the heterogeneity of potentially druggable genomic and transcriptomic pathways. METHODS: De-identified records from patients with TNBC or HR+/HER2- BC in the Tempus Database were randomly selected (N = 5000), with most having stage IV disease. Mutations, gene expression, and transcriptional signatures were evaluated from next-generation sequencing data. Genetic ancestry was estimated from DNA-seq. Differences in mutational prevalence, gene expression, and transcriptional signatures between AA and EA were compared. EA patients were used as the reference population for log fold-changes (logFC) in expression. RESULTS: After applying inclusion criteria, 3433 samples were evaluated (n = 623 AA and n = 2810 EA). Observed patterns of dysregulated pathways demonstrated significant heterogeneity among the two groups. Notably, PIK3CA mutations were significantly lower in AA HR+/HER2- tumors (AA = 34% vs. EA = 42%, P < 0.05) and the overall cohort (AA = 28% vs. EA = 37%, P = 2.08e-05). Conversely, KMT2C mutation was significantly more frequent in AA than EA TNBC (23% vs. 12%, P < 0.05) and HR+/HER2- (24% vs. 15%, P = 3e-03) tumors. Across all subtypes and stages, over 8000 genes were differentially expressed between the two ancestral groups including RPL10 (logFC = 2.26, P = 1.70e-162), HSPA1A (logFC = - 2.73, P = 2.43e-49), ATRX (logFC = - 1.93, P = 5.89e-83), and NUTM2F (logFC = 2.28, P = 3.22e-196). 
Ten differentially expressed gene sets were identified among stage IV HR+/HER2- tumors, of which four were considered relevant to BC treatment and were significantly enriched in EA: ERBB2_UP.V1_UP (P = 3.95e-06), LTE2_UP.V1_UP (P = 2.90e-05), HALLMARK_FATTY_ACID_METABOLISM (P = 0.0073), and HALLMARK_ANDROGEN_RESPONSE (P = 0.0074). CONCLUSIONS: We observed significant differences in mutational spectra, gene expression, and relevant transcriptional signatures between patients with genetically determined African and European ancestries, particularly within the HR+/HER2- BC and TNBC subtypes. These findings could guide future development of treatment strategies by providing opportunities for biomarker-informed research and, ultimately, clinical decisions for precision oncology care in diverse populations.
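
The reference-group convention for logFC described in the methods can be illustrated with a minimal sketch. The abstract does not state the logarithm base, the normalization, or any pseudocount, so the base-2 logarithm, the pseudocount of 1, and the expression values below are assumptions chosen only for illustration.

```python
import math
from statistics import mean


def log2_fold_change(group, reference, pseudocount=1.0):
    """Return log2(mean(group) / mean(reference)) with a stabilizing pseudocount.

    A positive value means higher expression in `group` than in `reference`.
    The base-2 log and the pseudocount are assumptions, not taken from the study.
    """
    return math.log2((mean(group) + pseudocount) / (mean(reference) + pseudocount))


# Hypothetical normalized expression values for a single gene.
aa_expr = [15.0, 15.0, 15.0]
ea_expr = [3.0, 3.0, 3.0]

# EA is the reference population, so it goes in the denominator.
lfc = log2_fold_change(aa_expr, ea_expr)
```

With EA in the denominator, a positive logFC indicates higher expression in AA samples and a negative logFC indicates higher expression in EA samples.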


Subjects
Breast Neoplasms , Triple Negative Breast Neoplasms , Female , Humans , Black People/genetics , Breast Neoplasms/ethnology , Breast Neoplasms/pathology , Mutation , Precision Medicine , Triple Negative Breast Neoplasms/ethnology , Triple Negative Breast Neoplasms/pathology , White People
5.
Mod Pathol ; 33(8): 1546-1556, 2020 08.
Article in English | MEDLINE | ID: mdl-32161378

ABSTRACT

In patients with invasive breast cancer, fluorescence in situ hybridization (FISH) testing for HER2 typically demonstrates the clear presence or lack of ERBB2 (HER2) amplification (i.e., groups 1 or 5). However, a small subset of patients can present with unusual HER2 FISH patterns (groups 2-4), resulting in diagnostic confusion. To provide clarity, the 2018 CAP/ASCO HER2 testing guideline recommends additional testing using HER2 immunohistochemistry (IHC) for determining the final HER2 status. Despite this effort, the genomic correlates of unusual HER2 FISH groups remain poorly understood. Here, we used droplet digital PCR (ddPCR) and targeted next-generation sequencing (NGS) to characterize the genomic features of both usual and unusual HER2 FISH groups. In this study, 51 clinical samples were selected to represent FISH groups 1-5. Furthermore, group 1 was subdivided into two groups, with groups 1A and 1B corresponding to cases with HER2 signals/cell ≥6.0 and 4-6, respectively. Overall, our findings revealed a wide range of copy number alterations in HER2 across the different FISH groups. As expected, groups 1A and 5 showed the clear presence and lack of HER2 copy number gain, respectively, as measured by ddPCR and NGS. In contrast, group 1B and other uncommon FISH groups (groups 2-4) were characterized by a broader range of HER2 copy levels with only a few select cases showing high-level gain. Notably, these cases with increased HER2 copy levels also showed HER2 overexpression by IHC, thus highlighting the correlation between HER2 copy number and HER2 protein expression. Given the concordance between the genomic and protein results, our findings suggest that HER2 IHC may inform HER2 copy number status in patients with unusual FISH patterns. Hence, our results support the current recommendation for using IHC to resolve HER2 status in FISH groups 2-4.
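
The FISH grouping used in this abstract can be sketched as a small decision function. The 1A/1B split (HER2 signals/cell of at least 6.0 versus 4-6) comes from the abstract. The other cut-offs (a HER2/CEP17 ratio of 2.0, and 4.0 and 6.0 signals per cell) are assumed here from the 2018 guideline and are not restated in the abstract. The function name is hypothetical.

```python
def her2_fish_group(ratio, copies):
    """Assign a HER2 FISH group from dual-probe results.

    ratio  -- HER2/CEP17 ratio
    copies -- mean HER2 signals per cell
    Cut-offs are assumed from the 2018 guideline; the 1A/1B split
    follows the abstract (>= 6.0 versus 4-6 signals per cell).
    """
    if ratio >= 2.0:
        if copies >= 6.0:
            return "1A"  # amplified, high copy number
        if copies >= 4.0:
            return "1B"  # amplified, 4-6 signals per cell
        return "2"       # ratio high, copy number low
    if copies >= 6.0:
        return "3"       # ratio low, copy number high
    if copies >= 4.0:
        return "4"       # ratio low, copy number intermediate
    return "5"           # not amplified
```

For example, `her2_fish_group(1.5, 4.5)` falls into group 4, one of the unusual patterns for which the abstract supports resolving HER2 status by IHC.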


Subjects
Tumor Biomarkers/analysis , Breast Neoplasms/genetics , Fluorescence In Situ Hybridization/methods , ErbB-2 Receptor/genetics , Adult , Aged , Aged 80 and over , Tumor Biomarkers/genetics , DNA Copy Number Variations , Female , High-Throughput Nucleotide Sequencing/methods , Humans , Immunohistochemistry/methods , Middle Aged , Polymerase Chain Reaction/methods , ErbB-2 Receptor/analysis , DNA Sequence Analysis/methods
6.
Lancet Oncol ; 19(6): 785-798, 2018 06.
Article in English | MEDLINE | ID: mdl-29753700

ABSTRACT

BACKGROUND: Medulloblastoma is associated with rare hereditary cancer predisposition syndromes; however, consensus medulloblastoma predisposition genes have not been defined and screening guidelines for genetic counselling and testing for paediatric patients are not available. We aimed to assess and define these genes to provide evidence for future screening guidelines. METHODS: In this international, multicentre study, we analysed patients with medulloblastoma from retrospective cohorts (International Cancer Genome Consortium [ICGC] PedBrain, Medulloblastoma Advanced Genomics International Consortium [MAGIC], and the CEFALO series) and from prospective cohorts from four clinical studies (SJMB03, SJMB12, SJYC07, and I-HIT-MED). Whole-genome sequences and exome sequences from blood and tumour samples were analysed for rare damaging germline mutations in cancer predisposition genes. DNA methylation profiling was done to determine consensus molecular subgroups: WNT (MBWNT), SHH (MBSHH), group 3 (MBGroup3), and group 4 (MBGroup4). Medulloblastoma predisposition genes were predicted on the basis of rare variant burden tests against controls without a cancer diagnosis from the Exome Aggregation Consortium (ExAC). Previously defined somatic mutational signatures were used to further classify medulloblastoma genomes into two groups, a clock-like group (signatures 1 and 5) and a homologous recombination repair deficiency-like group (signatures 3 and 8), and chromothripsis was investigated using previously established criteria. Progression-free survival and overall survival were modelled for patients with a genetic predisposition to medulloblastoma. FINDINGS: We included a total of 1022 patients with medulloblastoma from the retrospective cohorts (n=673) and the four prospective studies (n=349), from whom blood samples (n=1022) and tumour samples (n=800) were analysed for germline mutations in 110 cancer predisposition genes. 
In our rare variant burden analysis, we compared these against 53 105 sequenced controls from ExAC and identified APC, BRCA2, PALB2, PTCH1, SUFU, and TP53 as consensus medulloblastoma predisposition genes according to our rare variant burden analysis and estimated that germline mutations accounted for 6% of medulloblastoma diagnoses in the retrospective cohort. The prevalence of genetic predispositions differed between molecular subgroups in the retrospective cohort and was highest for patients in the MBSHH subgroup (20% in the retrospective cohort). These estimates were replicated in the prospective clinical cohort (germline mutations accounted for 5% of medulloblastoma diagnoses, with the highest prevalence [14%] in the MBSHH subgroup). Patients with germline APC mutations developed MBWNT and accounted for most (five [71%] of seven) cases of MBWNT that had no somatic CTNNB1 exon 3 mutations. Patients with germline mutations in SUFU and PTCH1 mostly developed infant MBSHH. Germline TP53 mutations presented only in childhood patients in the MBSHH subgroup and explained more than half (eight [57%] of 14) of all chromothripsis events in this subgroup. Germline mutations in PALB2 and BRCA2 were observed across the MBSHH, MBGroup3, and MBGroup4 molecular subgroups and were associated with mutational signatures typical of homologous recombination repair deficiency. In patients with a genetic predisposition to medulloblastoma, 5-year progression-free survival was 52% (95% CI 40-69) and 5-year overall survival was 65% (95% CI 52-81); these survival estimates differed significantly across patients with germline mutations in different medulloblastoma predisposition genes. INTERPRETATION: Genetic counselling and testing should be used as a standard-of-care procedure in patients with MBWNT and MBSHH because these patients have the highest prevalence of damaging germline mutations in known cancer predisposition genes. 
We propose criteria for routine genetic screening for patients with medulloblastoma based on clinical and molecular tumour characteristics. FUNDING: German Cancer Aid; German Federal Ministry of Education and Research; German Childhood Cancer Foundation (Deutsche Kinderkrebsstiftung); European Research Council; National Institutes of Health; Canadian Institutes for Health Research; German Cancer Research Center; St Jude Comprehensive Cancer Center; American Lebanese Syrian Associated Charities; Swiss National Science Foundation; European Molecular Biology Organization; Cancer Research UK; Hertie Foundation; Alexander and Margaret Stewart Trust; V Foundation for Cancer Research; Sontag Foundation; Musicians Against Childhood Cancer; BC Cancer Foundation; Swedish Council for Health, Working Life and Welfare; Swedish Research Council; Swedish Cancer Society; the Swedish Radiation Protection Authority; Danish Strategic Research Council; Swiss Federal Office of Public Health; Swiss Research Foundation on Mobile Communication; Masaryk University; Ministry of Health of the Czech Republic; Research Council of Norway; Genome Canada; Genome BC; Terry Fox Research Institute; Ontario Institute for Cancer Research; Pediatric Oncology Group of Ontario; The Family of Kathleen Lorette and the Clark H Smith Brain Tumour Centre; Montreal Children's Hospital Foundation; The Hospital for Sick Children: Sonia and Arthur Labatt Brain Tumour Research Centre, Chief of Research Fund, Cancer Genetics Program, Garron Family Cancer Centre, MDT's Garron Family Endowment; BC Childhood Cancer Parents Association; Cure Search Foundation; Pediatric Brain Tumor Foundation; Brainchild; and the Government of Ontario.
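
The idea behind a case-control rare variant burden comparison can be sketched with a one-sided Fisher's exact test on carrier counts for a single gene. The abstract does not name the specific burden test, so Fisher's exact test is a stand-in here. The cohort sizes (1022 patients, 53 105 ExAC controls) come from the abstract; the carrier counts are hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a -- cases carrying a rare damaging variant    b -- cases without one
    c -- controls carrying one                     d -- controls without one
    Returns P(carriers among cases >= a) under the hypergeometric null.
    """
    n_cases = a + b
    n_carriers = a + c
    total = a + b + c + d
    upper = min(n_cases, n_carriers)
    numerator = sum(
        comb(n_cases, k) * comb(total - n_cases, n_carriers - k)
        for k in range(a, upper + 1)
    )
    return numerator / comb(total, n_carriers)


# Cohort sizes from the abstract; carrier counts are hypothetical.
cases_total, controls_total = 1022, 53105
case_carriers, control_carriers = 20, 100

p_value = fisher_exact_greater(
    case_carriers,
    cases_total - case_carriers,
    control_carriers,
    controls_total - control_carriers,
)
```

A small `p_value` would indicate that rare damaging variants in the gene are enriched among patients relative to controls. A real analysis would also need to handle multiple testing across the 110 genes and differences in ancestry and sequencing coverage between cohorts.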


Subjects
Tumor Biomarkers/genetics , Cerebellar Neoplasms/genetics , DNA Methylation , Genetic Testing/methods , Germ-Line Mutation , Medulloblastoma/genetics , Genetic Models , Adolescent , Adult , Cerebellar Neoplasms/mortality , Cerebellar Neoplasms/pathology , Cerebellar Neoplasms/therapy , Child , Preschool Child , DNA Mutational Analysis , Female , Gene Expression Profiling , Genetic Predisposition to Disease , Heredity , Humans , Infant , Male , Medulloblastoma/mortality , Medulloblastoma/pathology , Medulloblastoma/therapy , Pedigree , Phenotype , Predictive Value of Tests , Progression-Free Survival , Prospective Studies , Reproducibility of Results , Retrospective Studies , Risk Factors , Transcriptome , Exome Sequencing , Young Adult
7.
Bioinformatics ; 33(8): 1147-1153, 2017 04 15.
Article in English | MEDLINE | ID: mdl-28035032

ABSTRACT

Motivation: Variant calling from next-generation sequencing (NGS) data is susceptible to false positive calls due to sequencing, mapping and other errors. To better distinguish true from false positive calls, we present a method that uses genotype array data from the sequenced samples, rather than public data such as HapMap or dbSNP, to train an accurate classifier using Random Forests. We demonstrate our method on a set of variant calls obtained from 642 African-ancestry genomes from the Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA), sequenced to high depth (30X). Results: We have applied our classifier to compare call sets generated with different calling methods, including both single-sample and multi-sample callers. At a False Positive Rate of 5%, our method determines true positive rates of 97.5%, 95% and 99% on variant calls obtained using Illumina's single-sample caller CASAVA, Real Time Genomics' multisample variant caller, and the GATK UnifiedGenotyper, respectively. Since NGS data may be accompanied by genotype data for the same samples, either collected concurrently with sequencing or from a previous study, our method can be trained on each dataset to provide a more accurate computational validation of site calls compared to generic methods. Moreover, our method allows for adjustment based on allele frequency (e.g. a different set of criteria to determine quality for rare versus common variants) and thereby provides insight into sequencing characteristics that indicate call quality for variants of different frequencies. Availability and Implementation: Code is available on Github at: https://github.com/suyashss/variant_validation. Contacts: suyashs@stanford.edu or mtaub@jhsph.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
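
The way a true positive rate is read off at a fixed 5% false positive rate can be sketched as follows. This is not the authors' code (their repository is linked above). The function name, the scores, and the labels are hypothetical, and the labels stand in for concordance with genotype array data.

```python
def tpr_at_fpr(scores, labels, max_fpr=0.05):
    """Return the highest TPR reachable by a score threshold with FPR <= max_fpr.

    scores -- classifier score per variant call (higher = more likely true)
    labels -- True if the call is confirmed (e.g. by array data), else False
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_tpr = 0.0
    # Lower the threshold step by step; false positives can only accumulate.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
        if fp / negatives > max_fpr:
            break
        best_tpr = tp / positives
    return best_tpr


# Hypothetical scores: 4 confirmed calls followed by 20 unconfirmed calls.
scores = [0.9, 0.8, 0.7, 0.3] + [0.6, 0.4] + [0.1] * 18
labels = [True] * 4 + [False] * 20

tpr = tpr_at_fpr(scores, labels)
```

In this toy data, one false positive out of 20 is allowed at the 5% level, which admits three of the four confirmed calls.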


Subjects
High-Throughput Nucleotide Sequencing/methods , Single Nucleotide Polymorphism , Whole Genome Sequencing/methods , Data Accuracy , Human Genome , Genomics/methods , Genomics/standards , Genotype , Genotyping Techniques/methods , Genotyping Techniques/standards , High-Throughput Nucleotide Sequencing/standards , Humans , Whole Genome Sequencing/standards
8.
Proc Natl Acad Sci U S A ; 112(45): 13976-81, 2015 Nov 10.
Article in English | MEDLINE | ID: mdl-26504226

ABSTRACT

Although a variety of genetic alterations have been found across cancer types, the identification and functional characterization of candidate driver genetic lesions in an individual patient and their translation into clinically actionable strategies remain major hurdles. Here, we use whole genome sequencing of a prostate cancer tumor, computational analyses, and experimental validation to identify and predict novel oncogenic activity arising from a point mutation in the phosphatase and tensin homolog (PTEN) tumor suppressor protein. We demonstrate that this mutation (p.A126G) produces an enzymatic gain-of-function in PTEN, shifting its function from a phosphoinositide (PI) 3-phosphatase to a phosphoinositide (PI) 5-phosphatase. Using cellular assays, we demonstrate that this gain-of-function activity shifts cellular phosphoinositide levels, hyperactivates the PI3K/Akt cell proliferation pathway, and exhibits increased cell migration beyond canonical PTEN loss-of-function mutants. These findings suggest that mutationally modified PTEN can actively contribute to well-defined hallmarks of cancer. Lastly, we demonstrate that these effects can be substantially mitigated through chemical PI3K inhibitors. These results demonstrate a new dysfunction paradigm for PTEN cancer biology and suggest a potential framework for the translation of genomic data into actionable clinical strategies for targeted patient therapy.


Subjects
Tumor Suppressor Genes , Neoplasm Proteins/genetics , PTEN Phosphohydrolase/genetics , Phosphoric Monoester Hydrolases/genetics , Prostatic Neoplasms/genetics , Analysis of Variance , Animals , Base Sequence , CHO Cells , Cell Movement/physiology , Cell Proliferation/physiology , Computational Biology/methods , Cricetinae , Cricetulus , Humans , Immunoblotting , Male , Fluorescence Microscopy , Molecular Sequence Annotation , Molecular Sequence Data , Site-Directed Mutagenesis , Patch-Clamp Techniques , Phosphatidylinositols/metabolism , Phosphoric Monoester Hydrolases/metabolism , DNA Sequence Analysis
9.
Nucleic Acids Res ; 42(11): 6921-34, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24771338

ABSTRACT

Nucleosomes play important roles in a cell beyond their basal functionality in chromatin compaction. Their placement affects all steps in transcriptional regulation, from transcription factor (TF) binding to messenger ribonucleic acid (mRNA) synthesis. Careful profiling of their locations and dynamics in response to stimuli is important to further our understanding of transcriptional regulation by the state of chromatin. We measured nucleosome occupancy in human hepatic cells before and after treatment with transforming growth factor beta 1 (TGFβ1), using massively parallel sequencing. With a newly developed method, SuMMIt, for precise positioning of nucleosomes, we inferred dynamics of the nucleosomal landscape. Distinct nucleosome positioning has previously been described at transcription start sites and flanking TF binding sites. We found that the average pattern is present at very few sites and, in the case of TF binding, the double peak surrounding the sites is just an artifact of averaging over many loci. We systematically searched for depleted nucleosomes in stimulated cells compared to unstimulated cells and identified 24 318 loci. Depending on genomic annotation, 44-78% of them were over-represented in binding motifs for TFs. Changes in binding affinity were verified for HNF4α by qPCR. Strikingly, many of these loci were associated with expression changes, as measured by RNA sequencing.


Subjects
Nucleosomes/metabolism , Transforming Growth Factor beta1/pharmacology , Bayes Theorem , Cell Line , Gene Expression Regulation , Hepatocyte Nuclear Factor 4/metabolism , Humans , Nucleosomes/drug effects
11.
Am J Hum Genet ; 91(4): 660-71, 2012 Oct 05.
Article in English | MEDLINE | ID: mdl-23040495

ABSTRACT

Full sequencing of individual human genomes has greatly expanded our understanding of human genetic variation and population history. Here, we present a systematic analysis of 50 human genomes from 11 diverse global populations sequenced at high coverage. Our sample includes 12 individuals who have admixed ancestry and who have varying degrees of recent (within the last 500 years) African, Native American, and European ancestry. We found over 21 million single-nucleotide variants that contribute to a 1.75-fold range in nucleotide heterozygosity across diverse human genomes. This heterozygosity ranged from a high of one heterozygous site per kilobase in west African genomes to a low of 0.57 heterozygous sites per kilobase in segments inferred to have diploid Native American ancestry from the genomes of Mexican and Puerto Rican individuals. We show evidence of all three continental ancestries in the genomes of Mexican, Puerto Rican, and African American populations, and the genome-wide statistics are highly consistent across individuals from a population once ancestry proportions have been accounted for. Using a generalized linear model, we identified subtle variations across populations in the proportion of neutral versus deleterious variation and found that genome-wide statistics vary in admixed populations even once ancestry proportions have been factored in. We further infer that multiple periods of gene flow shaped the diversity of admixed populations in the Americas: 70% of the European ancestry in today's African Americans dates back to European gene flow happening only 7-8 generations ago.
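
The reported 1.75-fold range follows directly from the two heterozygosity figures in the abstract, as this short check shows. The helper function and its example inputs are hypothetical additions that only show how a per-kilobase rate would be computed from raw counts.

```python
def heterozygosity_per_kb(het_sites, callable_bases):
    """Heterozygous sites per kilobase of callable sequence (hypothetical helper)."""
    return het_sites / (callable_bases / 1000)


# Figures quoted in the abstract (heterozygous sites per kilobase).
high_per_kb = 1.0   # west African genomes
low_per_kb = 0.57   # segments with diploid Native American ancestry

fold_range = high_per_kb / low_per_kb
```

Rounded to two decimals, `fold_range` reproduces the 1.75-fold figure stated in the abstract.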


Subjects
Human Genome , Haplotypes/genetics , Population/genetics , Racial Groups/genetics , Population Genetics/methods , Heterozygote , Humans , Single Nucleotide Polymorphism
12.
Pac Symp Biocomput ; 29: 433-445, 2024.
Article in English | MEDLINE | ID: mdl-38160297

ABSTRACT

The incompleteness of race and ethnicity information in real-world data (RWD) hampers its utility in promoting healthcare equity. This study introduces two methods, one heuristic and the other machine learning-based, to impute race and ethnicity from genetic ancestry using tumor profiling data. Analyzing de-identified data from over 100,000 cancer patients sequenced with the Tempus xT panel, we demonstrate that both methods outperform existing geolocation and surname-based methods, with the machine learning approach achieving high recall (range: 0.859-0.993) and precision (range: 0.932-0.981) across four mutually exclusive race and ethnicity categories. This work presents a novel pathway to enhance RWD utility in studying racial disparities in healthcare.
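
Per-category recall and precision, the metrics reported above, can be computed as in this minimal sketch. The abstract does not name the four categories, so the labels "A" to "D" are placeholders, and the toy data are hypothetical.

```python
from collections import Counter


def per_class_precision_recall(true_labels, predicted_labels):
    """Return {category: (precision, recall)} for mutually exclusive categories."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == predicted:
            tp[truth] += 1
        else:
            fp[predicted] += 1  # predicted category gains a false positive
            fn[truth] += 1      # true category gains a false negative
    metrics = {}
    for category in sorted(set(true_labels) | set(predicted_labels)):
        predicted_total = tp[category] + fp[category]
        actual_total = tp[category] + fn[category]
        precision = tp[category] / predicted_total if predicted_total else 0.0
        recall = tp[category] / actual_total if actual_total else 0.0
        metrics[category] = (precision, recall)
    return metrics


# Placeholder categories and hypothetical labels.
truth = ["A", "A", "A", "B", "B", "C", "D", "D"]
imputed = ["A", "A", "B", "B", "B", "C", "D", "C"]

metrics = per_class_precision_recall(truth, imputed)
```

Reporting the range of each metric across categories, as the abstract does, then amounts to taking the minimum and maximum over the values in `metrics`.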


Subjects
Ethnicity , Names , Humans , Ethnicity/genetics , Racial Groups/genetics , Computational Biology , Genetic Testing
13.
Pac Symp Biocomput ; 29: 322-326, 2024.
Article in English | MEDLINE | ID: mdl-38160289

ABSTRACT

The following sections are included: Overview; Dealing with the lack of diversity in current research datasets; Development of fair machine learning algorithms; Race, genetic ancestry, and population structure; Conclusion; Acknowledgments.


Subjects
Computational Biology , Precision Medicine , Humans , Machine Learning , Health Inequities
14.
PLoS Genet ; 6(8): e1001070, 2010 Aug 19.
Article in English | MEDLINE | ID: mdl-20808890

ABSTRACT

The differentiation of cells into distinct cell types, each of which is heritable for many generations, underlies many biological phenomena. White and opaque cells of the fungal pathogen Candida albicans are two such heritable cell types, each thought to be adapted to unique niches within their human host. To systematically investigate their differences, we performed strand-specific, massively-parallel sequencing of RNA from C. albicans white and opaque cells. With these data we first annotated the C. albicans transcriptome, finding hundreds of novel differentially-expressed transcripts. Using the new annotation, we compared differences in transcript abundance between the two cell types with the genomic regions bound by a master regulator of the white-opaque switch (Wor1). We found that the revised transcriptional landscape considerably alters our understanding of the circuit governing differentiation. In particular, we can now resolve the poor concordance between binding of a master regulator and the differential expression of adjacent genes, a discrepancy observed in several other studies of cell differentiation. More than one third of the Wor1-bound differentially-expressed transcripts were previously unannotated, which explains the formerly puzzling presence of Wor1 at these positions along the genome. Many of these newly identified Wor1-regulated genes are non-coding and transcribed antisense to coding transcripts. We also find that 5' and 3' UTRs of mRNAs in the circuit are unusually long and that 5' UTRs often differ in length between cell-types, suggesting UTRs encode important regulatory information and that use of alternative promoters is widespread. Further analysis revealed that the revised Wor1 circuit bears several striking similarities to the Oct4 circuit that specifies the pluripotency of mammalian embryonic stem cells. 
Additional characteristics shared with the Oct4 circuit suggest a set of general hallmarks characteristic of heritable differentiation states in eukaryotes.


Subjects
Candida albicans/cytology , Candida albicans/genetics , Cell Division , Gene Expression Profiling , Candida albicans/metabolism , Candidiasis/microbiology , Embryonic Stem Cells/microbiology , Fungal Proteins/genetics , Fungal Proteins/metabolism , Fungal Gene Expression Regulation , Humans , Genetic Transcription
15.
Bioinform Adv ; 3(1): vbad062, 2023.
Article in English | MEDLINE | ID: mdl-37416509

ABSTRACT

Summary: RNA sequencing (RNA-seq) can be applied to diverse tasks including quantifying gene expression, discovering quantitative trait loci and identifying gene fusion events. Although RNA-seq can detect germline variants, the complexities of variable transcript abundance, target capture and amplification introduce challenging sources of error. Here, we extend DeepVariant, a deep-learning-based variant caller, to learn and account for the unique challenges presented by RNA-seq data. Our DeepVariant RNA-seq model produces highly accurate variant calls from RNA-sequencing data, and outperforms existing approaches such as Platypus and GATK. We examine factors that influence accuracy, how our model addresses RNA editing events and how additional thresholding can be used to facilitate our models' use in a production pipeline. Supplementary information: Supplementary data are available at Bioinformatics Advances online.

16.
Pac Symp Biocomput ; 28: 181-185, 2023.
Article in English | MEDLINE | ID: mdl-36540975

ABSTRACT

The following sections are included: Overview; Equitable risk prediction; Pharmacoequity; Race, genetic ancestry, and population structure; Conclusion; Acknowledgments; References.


Subjects
Computational Biology , Precision Medicine , Humans
17.
ArXiv ; 2023 Mar 24.
Article in English | MEDLINE | ID: mdl-36994150

ABSTRACT

Scientists have been trying to identify all of the genes in the human genome since the initial draft of the genome was published in 2001. Over the intervening years, much progress has been made in identifying protein-coding genes, and the estimated number has shrunk to fewer than 20,000, although the number of distinct protein-coding isoforms has expanded dramatically. The invention of high-throughput RNA sequencing and other technological breakthroughs have led to an explosion in the number of reported non-coding RNA genes, although most of them do not yet have any known function. A combination of recent advances offers a path forward to identifying these functions and towards eventually completing the human gene catalogue. However, much work remains to be done before we have a universal annotation standard that includes all medically significant genes, maintains their relationships with different reference genomes, and describes clinically relevant genetic variants.

18.
Genome Res ; 19(9): 1527-41, 2009 Sep.
Article in English | MEDLINE | ID: mdl-19546169

ABSTRACT

We describe the genome sequencing of an anonymous individual of African origin using a novel ligation-based sequencing assay that enables a unique form of error correction that improves the raw accuracy of the aligned reads to >99.9%, allowing us to accurately call SNPs with as few as two reads per allele. We collected several billion mate-paired reads yielding approximately 18x haploid coverage of aligned sequence and close to 300x clone coverage. Over 98% of the reference genome is covered with at least one uniquely placed read, and 99.65% is spanned by at least one uniquely placed mate-paired clone. We identify over 3.8 million SNPs, 19% of which are novel. Mate-paired data are used to physically resolve haplotype phases of nearly two-thirds of the genotypes obtained and produce phased segments of up to 215 kb. We detect 226,529 intra-read indels, 5590 indels between mate-paired reads, 91 inversions, and four gene fusions. We use a novel approach for detecting indels between mate-paired reads that are smaller than the standard deviation of the insert size of the library and discover deletions in common with those detected with our intra-read approach. Dozens of mutations previously described in OMIM and hundreds of nonsynonymous single-nucleotide and structural variants in genes previously implicated in disease are identified in this individual. There is more genetic variation in the human genome still to be uncovered, and we provide guidance for future surveys in populations and cancer biopsies.


Subjects
Base Pairing , Computational Biology/methods , Genetic Variation , Human Genome , Ligases , DNA Sequence Analysis/methods , Africa , Base Sequence , Genomics , Genotype , Heterozygote , Homozygote , Humans , Single Nucleotide Polymorphism , Reference Standards
20.
Hum Hered ; 71(2): 113-25, 2011.
Article in English | MEDLINE | ID: mdl-21734402

ABSTRACT

Genome-wide association studies (GWAS) have been successful in identifying common genetic variation reproducibly associated with disease. However, most associated variants confer very small risk and after meta-analysis of large cohorts a large fraction of expected heritability still remains unexplained. A possible explanation is that rare variants currently undetected by GWAS with SNP arrays could contribute a large fraction of risk when present in cases. This concept has spurred great interest in exploring the role of rare variants in disease. As the cost of sequencing continues to plummet, it is becoming feasible to directly sequence case-control samples for testing disease association including rare variants. We have developed a test statistic that allows for association testing among cases and controls using data directly from sequencing reads. In addition, our method allows for random errors in reads. We determine the probability of a true genotype call based on the observed base pair reads using the expectation-maximization algorithm. We apply the SumStat procedure to obtain a single statistic for a group of multiple rare variant loci. We document the validity of our method through simulations. Our results suggest that our statistic maintains the correct type I error rate, even in the presence of differential misclassification for sequence reads, and that it has good power under a number of scenarios. Finally, our SumStat results show power at least as good as the maximum single locus results.
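
The general idea of obtaining genotype probabilities from error-prone reads with expectation-maximization can be sketched as below. The abstract gives no model details, so everything specific here is an assumption: a single biallelic site, a symmetric per-read error rate, a Hardy-Weinberg genotype prior, and hypothetical read counts. This is a generic illustration, not the authors' statistic or the SumStat procedure.

```python
from math import comb


def genotype_posterior(alt_reads, total_reads, alt_freq, error_rate):
    """Posterior over genotypes (0, 1, 2 alternate alleles) for one sample."""
    # Probability that a single read shows the alternate allele, per genotype.
    read_alt_prob = (error_rate, 0.5, 1.0 - error_rate)
    # Hardy-Weinberg prior (an assumption of this sketch).
    prior = ((1 - alt_freq) ** 2, 2 * alt_freq * (1 - alt_freq), alt_freq ** 2)
    joint = [
        p * comb(total_reads, alt_reads)
        * q ** alt_reads * (1 - q) ** (total_reads - alt_reads)
        for p, q in zip(prior, read_alt_prob)
    ]
    norm = sum(joint)
    return [j / norm for j in joint]


def em_allele_frequency(read_counts, error_rate=0.01, iterations=50):
    """Estimate the alternate allele frequency and per-sample genotype posteriors.

    read_counts -- list of (alt_reads, total_reads) pairs, one per sample
    """
    alt_freq = 0.5  # arbitrary starting value
    for _ in range(iterations):
        # E-step: genotype posteriors given the current frequency.
        posteriors = [
            genotype_posterior(alt, total, alt_freq, error_rate)
            for alt, total in read_counts
        ]
        # M-step: expected alternate-allele count over all chromosomes.
        alt_freq = sum(p[1] + 2 * p[2] for p in posteriors) / (2 * len(read_counts))
    # Recompute posteriors so they match the final frequency estimate.
    posteriors = [
        genotype_posterior(alt, total, alt_freq, error_rate)
        for alt, total in read_counts
    ]
    return alt_freq, posteriors


# Hypothetical (alt_reads, total_reads) for four samples at one site.
counts = [(0, 10), (5, 10), (10, 10), (0, 8)]
freq, posts = em_allele_frequency(counts)
calls = [max(range(3), key=lambda g: p[g]) for p in posts]
```

Working with posterior probabilities rather than hard calls is what lets read errors propagate into a downstream association statistic instead of being hidden by a single best-guess genotype.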


Subjects
Algorithms , Genetic Predisposition to Disease/genetics , High-Throughput Nucleotide Sequencing/methods , Single Nucleotide Polymorphism , Base Sequence , Case-Control Studies , Gene Frequency , Genome-Wide Association Study , Genotype , Haplotypes , Humans , Nucleic Acid Sequence Homology