1.
J Med Genet ; 58(5): 314-325, 2021 05.
Article in English | MEDLINE | ID: mdl-32518176

ABSTRACT

BACKGROUND: The nucleotide binding protein-like (NUBPL) gene was first reported as a cause of mitochondrial complex I deficiency (MIM 613621, 618242) in 2010. To date, only eight patients have been reported with this mitochondrial disorder. Five other patients were recently reported to have NUBPL disease but their clinical picture was different from the first eight patients. Here, we report clinical and genetic findings in five additional patients (four families). METHODS: Whole exome sequencing was used to identify patients with compound heterozygous NUBPL variants. Functional studies included RNA-Seq transcript analyses, missense variant biochemical analyses in a yeast model (Yarrowia lipolytica) and mitochondrial respiration experiments on patient fibroblasts. RESULTS: The previously reported c.815-27T>C branch-site mutation was found in all four families. In prior patients, c.166G>A [p.G56R] was always found in cis with c.815-27T>C, but only two of four families had both variants. The second variant found in trans with c.815-27T>C in each family was: c.311T>C [p.L104P] in three patients, c.693+1G>A in one patient and c.545T>C [p.V182A] in one patient. Complex I function in the yeast model was impacted by p.L104P but not p.V182A. Clinical features include onset of neurological symptoms at 3-18 months, global developmental delay, cerebellar dysfunction (including ataxia, dysarthria, nystagmus and tremor) and spasticity. Brain MRI showed cerebellar atrophy. Mitochondrial function studies on patient fibroblasts showed significantly reduced spare respiratory capacity. CONCLUSION: We report on five new patients with NUBPL disease, adding to the number and phenotypic variability of patients diagnosed worldwide, and review prior reported patients with pathogenic NUBPL variants.


Subject(s)
Mitochondrial Diseases/genetics , Mitochondrial Proteins/genetics , Adolescent , Brain/diagnostic imaging , Child , DNA Mutational Analysis , Female , Humans , Magnetic Resonance Imaging , Male , Mitochondrial Diseases/diagnostic imaging , Mitochondrial Diseases/physiopathology , Pedigree , RNA-Seq , Exome Sequencing , Young Adult
2.
J Med Genet ; 57(1): 62-69, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31391288

ABSTRACT

BACKGROUND: Pathogenic variants in mismatch repair (MMR) genes (MLH1, MSH2, MSH6 and PMS2) increase risk for Lynch syndrome and related cancers. We quantified tumour characteristics to assess variant pathogenicity for germline MMR genes. METHODS: Among 4740 patients with cancer with microsatellite instability (MSI) and immunohistochemical (IHC) results, we tested MMR pathogenic variant association with MSI/IHC status, and estimated likelihood ratios which we used to compute a tumour characteristic likelihood ratio (TCLR) for each variant. Predictive performance of TCLR in combination with in silico predictors, and a multifactorial variant prediction (MVP) model that included allele frequency, co-occurrence, co-segregation, and clinical and family history information was assessed. RESULTS: Compared with non-carriers, carriers of germline pathogenic/likely pathogenic (P/LP) variants were more likely to have abnormal MSI/IHC status (p<0.0001). Among 150 classified missense variants, 73.3% were accurately predicted with TCLR alone. Models leveraging in silico scores as prior probabilities accurately classified >76.7% of variants. Adding TCLR as quantitative evidence in an MVP model (MVP +TCLR Pred) increased the proportion of accurately classified variants from 88.0% (MVP alone) to 98.0% and generated optimal performance statistics among all models tested. Importantly, MVP +TCLR Pred resulted in a high yield of predicted classifications for missense variants of unknown significance (VUS); among 193 VUS, 62.7% were predicted as P/LP or benign/likely benign (B/LB) when assessed according to American College of Medical Genetics and Genomics/Association for Molecular Pathology guidelines. CONCLUSION: Our study demonstrates that when used separately or in conjunction with other evidence, tumour characteristics provide evidence for germline MMR missense variant assessment, which may have important implications for genetic testing and clinical management.


Subject(s)
DNA Mismatch Repair , Mutation, Missense , Neoplasms/genetics , Colorectal Neoplasms, Hereditary Nonpolyposis , Computer Simulation , DNA-Binding Proteins/genetics , Female , Genetic Predisposition to Disease , Germ-Line Mutation , Humans , Male , Microsatellite Instability , Middle Aged , Mismatch Repair Endonuclease PMS2/genetics , MutL Protein Homolog 1/genetics , MutS Homolog 2 Protein/genetics , Neoplasms/metabolism
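
The abstract above describes using in silico scores as prior probabilities and a likelihood ratio (the TCLR) as quantitative evidence. As a rough illustration only, the sketch below shows the generic Bayesian step of combining a prior with one or more likelihood ratios. It is not the authors' MVP model: the function name, the example numbers, and the assumption that the lines of evidence are independent are all hypothetical.

```python
from typing import Sequence


def posterior_probability(prior: float, likelihood_ratios: Sequence[float]) -> float:
    """Combine a prior probability of pathogenicity with likelihood ratios.

    Generic Bayes update (posterior odds = prior odds x product of LRs).
    Assumes the evidence sources are independent, which is an
    illustrative simplification, not a claim about the cited study.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    odds = prior / (1.0 - prior)          # convert probability to odds
    for lr in likelihood_ratios:
        if lr <= 0.0:
            raise ValueError("likelihood ratios must be positive")
        odds *= lr                        # each LR scales the odds
    return odds / (1.0 + odds)            # convert odds back to probability


if __name__ == "__main__":
    # Hypothetical values: a neutral prior and a single likelihood ratio of 9.
    print(posterior_probability(0.5, [9.0]))
```

Working in odds rather than probabilities keeps the update a plain multiplication, so further evidence can be added by appending another ratio to the list.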
3.
J Med Genet ; 56(7): 453-460, 2019 07.
Article in English | MEDLINE | ID: mdl-30890586

ABSTRACT

BACKGROUND: PALB2 monoallelic loss-of-function germ-line variants confer a breast cancer risk comparable to the average BRCA2 pathogenic variant. Recommendations for risk reduction strategies in carriers are similar. Elaborating robust criteria to identify loss-of-function variants in PALB2, without incurring overprediction, is thus of paramount clinical relevance. Towards this aim, we have performed a comprehensive characterisation of alternative splicing in PALB2, analysing its relevance for the classification of truncating and splice site variants according to the 2015 American College of Medical Genetics and Genomics-Association for Molecular Pathology guidelines. METHODS: Alternative splicing was characterised in RNAs extracted from blood, breast and fimbriae/ovary-related human specimens (n=112). RNAseq, RT-PCR/CE and CloneSeq experiments were performed by five contributing laboratories. Centralised revision/curation was performed to assure high-quality annotations. Additional splicing analyses were performed in PALB2 c.212-1G>A, c.1684+1G>A, c.2748+2T>G, c.3113+5G>A, c.3350+1G>A, c.3350+4A>C and c.3350+5G>A carriers. The impact of the findings on PVS1 status was evaluated for truncating and splice site variants. RESULTS: We identified 88 naturally occurring alternative splicing events (81 newly described), including 4 in-frame events predicted relevant to evaluate PVS1 status of splice site variants. We did not identify tissue-specific alternate gene transcripts in breast or ovarian-related samples, supporting the clinical relevance of blood-based splicing studies. CONCLUSIONS: PVS1 is not necessarily warranted for splice site variants targeting four PALB2 acceptor sites (exons 2, 5, 7 and 10). As a result, rare variants at these splice sites cannot be assumed pathogenic/likely pathogenic without further evidence. Our study raises a warning on up to five PALB2 genetic variants that are currently reported as pathogenic/likely pathogenic in ClinVar.


Subject(s)
Alternative Splicing , Fanconi Anemia Complementation Group N Protein/genetics , Genetic Association Studies , Genetic Predisposition to Disease , Alleles , Gene Expression Profiling , Genetic Association Studies/methods , Germ-Line Mutation , Humans , Mutation , Neoplasms/diagnosis , Neoplasms/genetics , Nonsense Mediated mRNA Decay , RNA Splice Sites
4.
Genet Med ; 21(7): 1603-1610, 2019 07.
Article in English | MEDLINE | ID: mdl-30563988

ABSTRACT

PURPOSE: Structural variation (SV) is associated with inherited diseases. Next-generation sequencing (NGS) is an efficient method for SV detection because of its high throughput, low cost, and base-pair resolution. However, due to lack of standard NGS protocols and a limited number of clinical samples with pathogenic SVs, comprehensive standards for SV detection, interpretation, and reporting have yet to be established. METHODS: We performed SV assessment on 60,000 clinical samples tested with hereditary cancer NGS panels spanning 48 genes. To evaluate NGS results, NGS and orthogonal methods were used separately in a blinded fashion for SV detection in all samples. RESULTS: A total of 1,037 SVs in coding sequence (CDS) or untranslated regions (UTRs) and 30,847 SVs in introns were detected and validated. Across all variant types, NGS shows 100% sensitivity and 99.9% specificity. Overall, 64% of CDS/UTR SVs were classified as pathogenic/likely pathogenic, and five deletions/duplications were reclassified as pathogenic using breakpoint information from NGS. CONCLUSION: The SVs presented here can be used as a valuable resource for clinical research and diagnostics. The data illustrate NGS as a powerful tool for SV detection. Application of NGS and confirmation technologies in genetic testing ensures delivering accurate and reliable results for diagnosis and patient care.


Subject(s)
Genetic Testing , High-Throughput Nucleotide Sequencing , Neoplasms/genetics , Humans , Neoplasms/diagnosis , Pseudogenes , Sensitivity and Specificity
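
The sensitivity and specificity figures quoted in the abstract above are standard confusion-matrix ratios. The following minimal sketch shows how such figures are derived when an assay is compared against an orthogonal reference method. The counts in the usage example are made up for illustration and are not taken from the study.

```python
from typing import Tuple


def sensitivity_specificity(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: reference-positive calls detected / missed by the assay.
    tn/fp: reference-negative calls correctly rejected / wrongly called.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one reference-positive and one reference-negative")
    sensitivity = tp / (tp + fn)   # fraction of true events that were detected
    specificity = tn / (tn + fp)   # fraction of non-events that were correctly rejected
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the calculation.
    sens, spec = sensitivity_specificity(tp=50, fp=1, tn=999, fn=0)
    print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```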
5.
Genet Med ; 20(9): 1099-1102, 2018 09.
Article in English | MEDLINE | ID: mdl-29388939

ABSTRACT

In the published version of this paper, some of the columns in the last three rows of Table 3 were mistakenly transposed. The corrected table appears below. In col. 6 of the row for DNMT3A, "S3" was published in the original article. However, in the revised table for the corrigendum, it has been corrected to "S1". In col. 6 of the row for SON, "S3" was published in the original article. However, in the revised table for the corrigendum, it has been corrected to "S2".

6.
Genet Med ; 19(2): 224-235, 2017 02.
Article in English | MEDLINE | ID: mdl-27513193

ABSTRACT

PURPOSE: Diagnostic exome sequencing (DES) is now a commonly ordered test for individuals with undiagnosed genetic disorders. In addition to providing a diagnosis for characterized diseases, exome sequencing has the capacity to uncover novel candidate genes for disease. METHODS: Family-based DES included analysis of both characterized and novel genetic etiologies. To evaluate candidate genes for disease in the clinical setting, we developed a systematic, rule-based classification schema. RESULTS: Testing identified a candidate gene among 7.7% (72/934) of patients referred for DES; 37 (4.0%) and 35 (3.7%) of the genes received evidence scores of "candidate" and "suspected candidate," respectively. A total of 71 independent candidate genes were reported among the 72 patients, and 38% (27/71) were subsequently corroborated in the peer-reviewed literature. This rate of corroboration increased to 51.9% (27/52) among patients whose gene was reported at least 12 months previously. CONCLUSIONS: Herein, we provide transparent, comprehensive, and standardized scoring criteria for the clinical reporting of candidate genes. These results demonstrate that DES is an integral tool for genetic diagnosis, especially for elucidating the molecular basis for both characterized and novel candidate genetic etiologies. Gene discoveries also advance the understanding of normal human biology and more common diseases.


Subject(s)
Exome Sequencing , Genetic Association Studies , Genetic Diseases, Inborn/diagnosis , Genetic Diseases, Inborn/genetics , Databases, Genetic , Exome/genetics , Genetic Diseases, Inborn/pathology , High-Throughput Nucleotide Sequencing/methods , Humans , Mutation
7.
Brief Bioinform ; 13(6): 656-68, 2012 Nov.
Article in English | MEDLINE | ID: mdl-22772836

ABSTRACT

The rapid advances in high-throughput sequencing technologies have dramatically stimulated metagenomic studies of microbial communities in various environments. Fundamental questions in metagenomics include the identities, composition and dynamics of microbial populations and their functions and interactions. However, the massive quantity and the comprehensive complexity of these sequence data pose tremendous challenges in data analysis. These challenges include but are not limited to ever-increasing computational demand, biased sequence sampling, sequence errors, sequence artifacts and novel sequences. Sequence clustering methods can directly answer many of the fundamental questions by grouping similar sequences into families. In addition, clustering analysis also addresses the challenges in metagenomics. Thus, a large redundant data set can be represented with a small non-redundant set, where each cluster can be represented by a single entry or a consensus. Artifacts can be rapidly detected through clustering. Errors can be identified, filtered or corrected by using consensus from sequences within clusters.


Subject(s)
Algorithms , Metagenome , Cluster Analysis , Metagenomics , Sequence Analysis, DNA
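
To make concrete the idea in the abstract above that a redundant data set can be reduced to one representative per cluster, here is a minimal sketch of greedy incremental clustering. It is an assumption-laden toy, not the method of any specific tool: similarity is approximated with Python's `difflib.SequenceMatcher` ratio rather than a real sequence alignment, and the function name and threshold are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def greedy_cluster(sequences: List[str], threshold: float = 0.9) -> List[Tuple[str, List[str]]]:
    """Group similar sequences; return (representative, members) pairs.

    Sequences are processed longest first. Each sequence joins the first
    cluster whose representative is similar enough, otherwise it founds
    a new cluster and becomes that cluster's representative.
    """
    clusters: List[Tuple[str, List[str]]] = []
    for seq in sorted(sequences, key=len, reverse=True):
        for representative, members in clusters:
            # Toy similarity score in [0, 1]; a real tool would use alignment identity.
            score = SequenceMatcher(None, representative, seq, autojunk=False).ratio()
            if score >= threshold:
                members.append(seq)
                break
        else:
            clusters.append((seq, [seq]))  # no match: seq starts a new cluster
    return clusters


if __name__ == "__main__":
    reads = ["ACGTACGTAC", "ACGTACGTAA", "TTTTGGGGCC"]
    for representative, members in greedy_cluster(reads, threshold=0.8):
        print(representative, len(members))
```

The non-redundant set is simply the list of representatives. Comparing every sequence against every representative is quadratic in the worst case, so this form is only suitable for small inputs.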
8.
Bioinformatics ; 29(1): 122-3, 2013 Jan 01.
Article in English | MEDLINE | ID: mdl-23044549

ABSTRACT

SUMMARY: Numerous metagenomics projects have produced tremendous amounts of sequencing data. Aligning these sequences to reference genomes is an essential analysis in metagenomics studies. Large-scale alignment data call for an intuitive and efficient visualization tool. However, current tools such as various genome browsers are highly specialized to handle intraspecies mapping results. They are not suitable for alignment data in metagenomics, which are often interspecies alignments. We have developed a web browser-based desktop application for interactively visualizing alignment data of metagenomic sequences. This viewer is easy to use on all computer systems with modern web browsers and requires no software installation. AVAILABILITY: http://weizhongli-lab.org/mgaviewer


Subject(s)
Metagenomics/methods , Sequence Alignment/methods , Software , Computer Graphics , Genome , Humans , Internet
9.
Bioinformatics ; 28(23): 3150-2, 2012 Dec 01.
Article in English | MEDLINE | ID: mdl-23060610

ABSTRACT

SUMMARY: CD-HIT is a widely used program for clustering biological sequences to reduce sequence redundancy and improve the performance of other sequence analyses. In response to the rapid increase in the amount of sequencing data produced by the next-generation sequencing technologies, we have developed a new CD-HIT program accelerated with a novel parallelization strategy and some other techniques to allow efficient clustering of such datasets. Our tests demonstrated very good speedup derived from the parallelization for up to ∼24 cores and a quasi-linear speedup for up to ∼8 cores. The enhanced CD-HIT is capable of handling very large datasets in much shorter time than previous versions. AVAILABILITY: http://cd-hit.org. CONTACT: liwz@sdsc.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Computational Biology/methods , Sequence Analysis, Protein/methods , Software , Algorithms , Cluster Analysis
10.
Infect Immun ; 80(6): 2150-7, 2012 Jun.
Article in English | MEDLINE | ID: mdl-22493085

ABSTRACT

Helminth parasites ensure their survival by regulating host immunity through mechanisms that dampen inflammation. These properties have recently been exploited therapeutically to treat human diseases. The biocomplexity of the intestinal lumen suggests that interactions between the parasite and the intestinal microbiota would also influence inflammation. In this study, we characterized the microbiota in the porcine proximal colon in response to Trichuris suis (whipworm) infection using 16S rRNA gene-based and whole-genome shotgun (WGS) sequencing. A 21-day T. suis infection in four pigs induced a significant change in the composition of the proximal colon microbiota compared to that of three parasite-naive pigs. Among the 15 phyla identified, the abundances of Proteobacteria and Deferribacteres were changed in infected pigs. The abundances of approximately 13% of genera were significantly altered by infection. Changes in relative abundances of Succinivibrio and Mucispirillum, for example, may relate to alterations in carbohydrate metabolism and niche disruptions in mucosal interfaces induced by parasitic infection, respectively. Of note, infection by T. suis led to a significant shift in the metabolic potential of the proximal colon microbiota, where 26% of all metabolic pathways identified were affected. Besides carbohydrate metabolism, lysine biosynthesis was repressed as well. A metabolomic analysis of volatile organic compounds (VOCs) in the luminal contents showed a relative absence in infected pigs of cofactors for carbohydrate and lysine biosynthesis, as well as an accumulation of oleic acid, suggesting altered fatty acid absorption contributing to local inflammation. Our findings should facilitate development of strategies for parasitic control in pigs and humans.


Subject(s)
Bacteria/classification , Colon/microbiology , Swine Diseases/parasitology , Trichuriasis/veterinary , Trichuris/classification , Animals , Carbohydrate Metabolism , Cluster Analysis , Fatty Acids/metabolism , Female , Gastrointestinal Contents/chemistry , Inflammation/veterinary , Metabolomics , Principal Component Analysis , Swine , Swine Diseases/immunology , Trichuriasis/immunology , Trichuriasis/parasitology , Volatile Organic Compounds/chemistry
11.
Bioinformatics ; 27(12): 1704-5, 2011 Jun 15.
Article in English | MEDLINE | ID: mdl-21505035

ABSTRACT

SUMMARY: Fragment recruitment, a process of aligning sequencing reads to reference genomes, is a crucial step in metagenomic data analysis. The available sequence alignment programs are either slow or insufficient for recruiting metagenomic reads. We implemented an efficient algorithm, FR-HIT, for fragment recruitment. We applied FR-HIT and several other tools including BLASTN, MegaBLAST, BLAT, LAST, SSAHA2, SOAP2, BWA and BWA-SW to recruit four metagenomic datasets from different types of sequencers. On average, FR-HIT and BLASTN recruited significantly more reads than other programs, while FR-HIT is about two orders of magnitude faster than BLASTN. FR-HIT is slower than the fastest programs (SOAP2, BWA and BWA-SW), but it recruited 1-5 times more reads. AVAILABILITY: http://weizhongli-lab.org/frhit.


Subject(s)
Metagenomics/methods , Sequence Alignment/methods , Software , Algorithms , Genome , Metagenomics/standards , Reference Standards , Sequence Alignment/standards , Sequence Analysis, DNA
12.
BMC Genomics ; 12: 444, 2011 Sep 07.
Article in English | MEDLINE | ID: mdl-21899761

ABSTRACT

BACKGROUND: The new field of metagenomics studies microorganism communities by culture-independent sequencing. With the advances in next-generation sequencing techniques, researchers are facing tremendous challenges in metagenomic data analysis due to the huge quantity and high complexity of sequence data. Analyzing large datasets is extremely time-consuming; also, metagenomic annotation involves a wide range of computational tools, which are difficult for ordinary users to install and maintain. The tools provided by the few available web servers are also limited and have various constraints such as login requirement, long waiting time, inability to configure pipelines etc. RESULTS: We developed WebMGA, a customizable web server for fast metagenomic analysis. WebMGA includes over 20 commonly used tools such as ORF calling, sequence clustering, quality control of raw reads, removal of sequencing artifacts and contaminations, taxonomic analysis, functional annotation etc. WebMGA provides users with rapid metagenomic data analysis using fast and effective tools, which have been implemented to run in parallel on our local computer cluster. Users can access WebMGA through web browsers or programming scripts to perform individual analysis or to configure and run customized pipelines. WebMGA is freely available at http://weizhongli-lab.org/metagenomic-analysis. CONCLUSIONS: WebMGA offers researchers many fast and unique tools and great flexibility for complex metagenomic data analysis.


Subject(s)
Internet , Metagenomics , Sequence Analysis, DNA/methods , Software , Cluster Analysis , Computational Biology/methods
13.
Article in English | MEDLINE | ID: mdl-32532877

ABSTRACT

Reticular dysgenesis is a form of severe combined immunodeficiency (SCID) caused by biallelic pathogenic variants in AK2. Here we present the case of a boy diagnosed with SCID following a positive newborn screen (NBS). Genetic testing revealed a homozygous variant: AK2 c.330 + 5G > A. In silico analyses predicted a weakened native donor splice site. However, this variant was initially classified as a variant of uncertain significance (VUS) given the lack of direct evidence. To determine the impact on splicing, we analyzed RNA from the proband and his parents, using massively parallel RNA-seq of cloned RT-PCR products. Analysis showed that c.330 + 5G > A results in exon 3 skipping, which encodes a critical region of the AK2 protein. With these results, the variant was upgraded to pathogenic, and the patient was given a diagnosis of reticular dysgenesis. Interpretation of VUS at noncanonical splice site nucleotides presents a challenge. RNA sequencing provides an ideal platform to perform qualitative and quantitative assessment of intronic VUS, which can lead to reclassification if a significant impact on mRNA is observed. Genetic disorders of hematopoiesis and immunity represent fruitful areas to apply RNA-based analysis for variant interpretation given the high expression of RNA in blood.


Subject(s)
Adenylate Kinase/genetics , Genetic Association Studies , Genetic Predisposition to Disease , Genetic Variation , Introns , Leukopenia/diagnosis , Leukopenia/genetics , Severe Combined Immunodeficiency/diagnosis , Severe Combined Immunodeficiency/genetics , Alleles , DNA Mutational Analysis , Exons , Humans , Infant , Infant, Newborn , Leukopenia/therapy , Male , Mutation , Peripheral Blood Stem Cell Transplantation , Phenotype , RNA Splicing , Severe Combined Immunodeficiency/therapy , Treatment Outcome
14.
Genome Med ; 12(1): 28, 2020 03 17.
Article in English | MEDLINE | ID: mdl-32183904

ABSTRACT

BACKGROUND: Classifying pathogenicity of missense variants represents a major challenge in clinical practice during the diagnosis of rare and genetically heterogeneous neurodevelopmental disorders (NDDs). While orthologous gene conservation is commonly employed in variant annotation, approximately 80% of known disease-associated genes belong to gene families. The use of gene family information for disease gene discovery and variant interpretation has not yet been investigated on a genome-wide scale. We empirically evaluate whether paralog-conserved or non-conserved sites in human gene families are important in NDDs. METHODS: Gene family information was collected from Ensembl. Paralog-conserved sites were defined based on paralog sequence alignments; 10,068 NDD patients and 2078 controls were statistically evaluated for de novo variant burden in gene families. RESULTS: We demonstrate that disease-associated missense variants are enriched at paralog-conserved sites across all disease groups and inheritance models tested. We developed a gene family de novo enrichment framework that identified 43 exome-wide enriched gene families including 98 de novo variant-carrying genes in NDD patients, of which 28 represent novel candidate genes for NDD that are brain expressed and under evolutionary constraint. CONCLUSION: This study represents the first method to incorporate gene family information into a statistical framework to interpret variant data for NDDs and to discover new NDD-associated genes.


Subject(s)
Developmental Disabilities/genetics , Genome-Wide Association Study/methods , Multigene Family , Mutation, Missense , Genetic Loci , Phylogeny , Sequence Homology
15.
NPJ Precis Oncol ; 4: 4, 2020.
Article in English | MEDLINE | ID: mdl-32133419

ABSTRACT

Germline variants in tumor suppressor genes (TSGs) can result in RNA mis-splicing and predisposition to cancer. However, identification of variants that impact splicing remains a challenge, contributing to a substantial proportion of patients with suspected hereditary cancer syndromes remaining without a molecular diagnosis. To address this, we used capture RNA-sequencing (RNA-seq) to generate a splicing profile of 18 TSGs (APC, ATM, BRCA1, BRCA2, BRIP1, CDH1, CHEK2, MLH1, MSH2, MSH6, MUTYH, NF1, PALB2, PMS2, PTEN, RAD51C, RAD51D, and TP53) in 345 whole-blood samples from healthy donors. We subsequently demonstrated that this approach can detect mis-splicing by comparing splicing profiles from the control dataset to profiles generated from whole blood of individuals previously identified with pathogenic germline splicing variants in these genes. To assess the utility of our TSG splicing profile to prospectively identify pathogenic splicing variants, we performed concurrent capture DNA and RNA-seq in a cohort of 1000 patients with suspected hereditary cancer syndromes. This approach improved the diagnostic yield in this cohort, resulting in a 9.1% relative increase in the detection of pathogenic variants, demonstrating the utility of performing simultaneous DNA and RNA genetic testing in a clinical context.

16.
Bioinformatics ; 24(7): 924-31, 2008 Apr 01.
Article in English | MEDLINE | ID: mdl-18296462

ABSTRACT

MOTIVATION: Pair-wise residue-residue contacts in proteins can be predicted from both threading templates and sequence-based machine learning. However, most structure modeling approaches only use the template-based contact predictions in guiding the simulations; this is partly because the sequence-based contact predictions are usually considered to be less accurate than that by threading. With the rapid progress in sequence databases and machine-learning techniques, it is necessary to have a detailed and comprehensive assessment of the contact-prediction methods in different template conditions. RESULTS: We develop two methods for protein-contact predictions: SVM-SEQ is a sequence-based machine learning approach which trains a variety of sequence-derived features on contact maps; SVM-LOMETS collects consensus contact predictions from multiple threading templates. We test both methods on the same set of 554 proteins which are categorized into 'Easy', 'Medium', 'Hard' and 'Very Hard' targets based on the evolutionary and structural distance between templates and targets. For the Easy and Medium targets, SVM-LOMETS obviously outperforms SVM-SEQ; but for the Hard and Very Hard targets, the accuracy of the SVM-SEQ predictions is higher than that of SVM-LOMETS by 12-25%. If we combine the SVM-SEQ and SVM-LOMETS predictions together, the total number of correctly predicted contacts in the Hard proteins will increase by more than 60% (or 70% for the long-range contact with a sequence separation ≥24), compared with SVM-LOMETS alone. The advantage of SVM-SEQ is also shown in the CASP7 free modeling targets where the SVM-SEQ is around four times more accurate than SVM-LOMETS in the long-range contact prediction. These data demonstrate that the state-of-the-art sequence-based contact prediction has reached a level which may be helpful in assisting tertiary structure modeling for the targets which do not have close structure templates. The maximum yield should be obtained by the combination of both sequence- and template-based predictions.


Subject(s)
Algorithms , Amino Acids/chemistry , Models, Chemical , Models, Molecular , Proteins/chemistry , Sequence Analysis, Protein/methods , Amino Acid Sequence , Artificial Intelligence , Binding Sites , Computer Simulation , Molecular Sequence Data , Pattern Recognition, Automated/methods , Protein Binding , Protein Folding
17.
Nucleic Acids Res ; 35(10): 3375-82, 2007.
Article in English | MEDLINE | ID: mdl-17478507

ABSTRACT

We developed LOMETS, a local threading meta-server, for quick and automated predictions of protein tertiary structures and spatial constraints. Nine state-of-the-art threading programs are installed and run in a local computer cluster, which ensures the quick generation of initial threading alignments compared with traditional remote-server-based meta-servers. Consensus models are generated from the top predictions of the component-threading servers, which are at least 7% more accurate than the best individual servers based on TM-score at a t-test significance level of 0.1%. Moreover, side-chain and C-alpha (Cα) contacts of 42 and 61% accuracy respectively, as well as long- and short-range distance maps, are automatically constructed from the threading alignments. These data can be easily used as constraints to guide the ab initio procedures such as TASSER for further protein tertiary structure modeling. The LOMETS server is freely available to the academic community at http://zhang.bioinformatics.ku.edu/LOMETS.


Subject(s)
Protein Structure, Tertiary , Sequence Alignment , Sequence Analysis, Protein/methods , Software , Algorithms , Amino Acid Sequence , Consensus Sequence , Models, Molecular
18.
JAMA Oncol ; 5(1): 51-57, 2019 01 01.
Article in English | MEDLINE | ID: mdl-30128536

ABSTRACT

Importance: Since the discovery of BRCA1 and BRCA2, multiple high- and moderate-penetrance genes have been reported as risk factors for hereditary breast cancer, ovarian cancer, or both; however, it is unclear whether these findings represent the complete genetic landscape of these cancers. Systematic investigation of the genetic contributions to breast and ovarian cancers is needed to confirm these findings and explore potentially new associations. Objective: To confirm reported and identify additional predisposition genes for breast or ovarian cancer. Design, Setting, and Participants: In this sample of 11 416 patients with clinical features of breast cancer, ovarian cancer, or both who were referred for genetic testing from 1200 hospitals and clinics across the United States and of 3988 controls who were referred for genetic testing for noncancer conditions between 2014 and 2015, whole-exome sequencing was conducted and gene-phenotype associations were examined. Case-control analyses using the Genome Aggregation Database as a set of reference controls were also conducted. Main Outcomes and Measures: Breast cancer risk associated with pathogenic variants among 625 cancer predisposition genes; association of identified predisposition breast or ovarian cancer genes with the breast cancer subtypes invasive ductal, invasive lobular, hormone receptor-positive, hormone receptor-negative, and male, and with early-onset disease. Results: Of 9639 patients with breast cancer, 3960 (41.1%) were early-onset cases (≤45 years at diagnosis) and 123 (1.3%) were male, with men having an older age at diagnosis than women (mean [SD] age, 61.8 [12.8] vs 48.6 [11.4] years). Of 2051 women with ovarian cancer, 445 (21.7%) received a diagnosis at 45 years or younger. Enrichment of pathogenic variants was identified in 4 non-BRCA genes associated with breast cancer risk: ATM (odds ratio [OR], 2.97; 95% CI, 1.67-5.68), CHEK2 (OR, 2.19; 95% CI, 1.40-3.56), PALB2 (OR, 5.53; 95% CI, 2.24-17.65), and MSH6 (OR, 2.59; 95% CI, 1.35-5.44). Increased risk for ovarian cancer was associated with 4 genes: MSH6 (OR, 4.16; 95% CI, 1.95-9.47), RAD51C (OR, not estimable; false-discovery rate-corrected P = .004), TP53 (OR, 18.50; 95% CI, 2.56-808.10), and ATM (OR, 2.85; 95% CI, 1.30-6.32). Neither the MRN complex genes nor CDKN2A was associated with increased breast or ovarian cancer risk. The findings also do not support previously reported breast cancer associations with the ovarian cancer susceptibility genes BRIP1, RAD51C, and RAD51D, or mismatch repair genes MSH2 and PMS2. Conclusions and Relevance: The results of this large-scale exome sequencing of patients and controls shed light on both well-established and controversial non-BRCA predisposition gene associations with breast or ovarian cancer reported to date and may implicate additional breast or ovarian cancer susceptibility gene candidates involved in DNA repair and genomic maintenance.


Subject(s)
Biomarkers, Tumor/genetics , Breast Neoplasms/genetics , Exome Sequencing , Ovarian Neoplasms/genetics , Adult , Aged , Breast Neoplasms/diagnosis , Breast Neoplasms, Male/genetics , Case-Control Studies , Female , Genetic Association Studies , Genetic Predisposition to Disease , Humans , Male , Middle Aged , Ovarian Neoplasms/diagnosis , Phenotype , Risk Assessment , Risk Factors , United States
19.
Sci Transl Med ; 11(521)2019 12 04.
Article in English | MEDLINE | ID: mdl-31801883

ABSTRACT

Hormonal therapy targeting the androgen receptor (AR) is initially effective in treating prostate cancer (PCa), but it eventually fails. It has been hypothesized that cellular heterogeneity of PCa, consisting of AR+ luminal tumor cells and AR- neuroendocrine (NE) tumor cells, may contribute to therapy failure. Here, we describe the successful purification of NE cells from primary fresh human prostate adenocarcinoma based on the cell surface receptor C-X-C motif chemokine receptor 2 (CXCR2). Functional studies revealed CXCR2 to be a driver of the NE phenotype, including loss of AR expression, lineage plasticity, and resistance to hormonal therapy. CXCR2-driven NE cells were critical for the tumor microenvironment by providing a survival niche for the AR+ luminal cells. We demonstrate that the combination of CXCR2 inhibition and AR targeting is an effective treatment strategy in mouse xenograft models. Such a strategy has the potential to overcome therapy resistance caused by tumor cell heterogeneity.


Subject(s)
Drug Resistance, Neoplasm , Molecular Targeted Therapy , Prostatic Neoplasms/drug therapy , Receptors, Interleukin-8B/antagonists & inhibitors , Animals , Biomarkers, Tumor/metabolism , Cell Line, Tumor , Cell Membrane/metabolism , Disease Progression , Humans , Male , Mice, Nude , Neoplasm Grading , Neoplastic Stem Cells/pathology , Neovascularization, Pathologic/metabolism , Neovascularization, Pathologic/pathology , Neuroendocrine Tumors/blood supply , Neuroendocrine Tumors/drug therapy , Neuroendocrine Tumors/pathology , Neurosecretory Systems/pathology , Phenotype , Prostatic Neoplasms/blood supply , Prostatic Neoplasms/pathology , Receptors, Interleukin-8B/metabolism , Signal Transduction , Tumor Microenvironment
20.
Proteins ; 72(2): 547-56, 2008 Aug.
Article in English | MEDLINE | ID: mdl-18247410

ABSTRACT

We develop a new threading algorithm, MUSTER, by extending the previous sequence profile-profile alignment method, PPA. It combines various sequence and structure information into single-body terms which can be conveniently used in dynamic programming search: (1) sequence profiles; (2) secondary structures; (3) structure fragment profiles; (4) solvent accessibility; (5) dihedral torsion angles; (6) hydrophobic scoring matrix. The balance of the weighting parameters is optimized by a grading search based on the average TM-score of 111 training proteins, which shows a better performance than using the conventional optimization methods based on the PROSUP database. The algorithm is tested on 500 nonhomologous proteins independent of the training sets. After removing the homologous templates with a sequence identity to the target >30%, in 224 cases, the first template alignment has the correct topology with a TM-score >0.5. Even with a more stringent cutoff by removing the templates with a sequence identity >20% or detectable by PSI-BLAST with an E-value <0.05, MUSTER is able to identify correct folds in 137 cases with the first model of TM-score >0.5. Depending on the homology cutoffs, the average TM-score of the first threading alignments by MUSTER is 5.1-6.3% higher than that by PPA. This improvement is statistically significant by the Wilcoxon signed rank test with a P-value < 1.0 × 10^-13, which demonstrates the effect of additional structural information on the protein fold recognition. The MUSTER server is freely available to the academic community at http://zhang.bioinformatics.ku.edu/MUSTER.


Subject(s)
Proteins/chemistry , Sequence Alignment , Algorithms , Databases, Protein , Models, Molecular , Protein Conformation
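
Illustrative note on record 20: the abstract describes folding several per-residue ("single-body") terms into one score that a dynamic programming search can use. The sketch below shows that general idea only. It is not MUSTER's scoring function: the three terms used here (residue identity, secondary-structure agreement, solvent-accessibility similarity), the weight names, the linear gap penalty, and the helper names `positions`, `position_score`, and `align` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Position = Dict[str, object]


def positions(seq: str, ss: str, sa: List[float]) -> List[Position]:
    """Bundle per-residue features (hypothetical layout, not MUSTER's)."""
    return [{"aa": a, "ss": s, "sa": x} for a, s, x in zip(seq, ss, sa)]


def position_score(q: Position, t: Position, w: Dict[str, float]) -> float:
    """Weighted sum of single-body terms for one query/template residue pair."""
    seq_term = 1.0 if q["aa"] == t["aa"] else -1.0
    ss_term = 1.0 if q["ss"] == t["ss"] else -1.0
    # Accessibility values are assumed to lie in [0, 1]; the term lies in [-1, 1].
    sa_term = 1.0 - 2.0 * abs(q["sa"] - t["sa"])
    return w["seq"] * seq_term + w["ss"] * ss_term + w["sa"] * sa_term


def align(query: List[Position], template: List[Position],
          w: Dict[str, float], gap: float = -1.0
          ) -> Tuple[float, List[Tuple[int, int]]]:
    """Global (Needleman-Wunsch style) alignment with a linear gap penalty."""
    n, m = len(query), len(template)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = position_score(query[i - 1], template[j - 1], w)
            dp[i][j] = max(dp[i - 1][j - 1] + s,   # align the two residues
                           dp[i - 1][j] + gap,     # gap in the template
                           dp[i][j - 1] + gap)     # gap in the query
    # Trace back from the corner to recover the aligned residue pairs.
    pairs: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        s = position_score(query[i - 1], template[j - 1], w)
        if dp[i][j] == dp[i - 1][j - 1] + s:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif dp[i][j] == dp[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return dp[n][m], pairs


weights = {"seq": 1.0, "ss": 0.5, "sa": 0.5}  # illustrative weights only
query = positions("ACD", "HHE", [0.1, 0.5, 0.9])
template = positions("ACDK", "HHEC", [0.1, 0.5, 0.9, 0.3])
score, pairs = align(query, template, weights)
```

Because every term depends only on one query position and one template position, the combined score fits the standard alignment recurrence unchanged; tuning the weights (the abstract reports doing this against average TM-score on training proteins) only changes the values fed into the same recurrence.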