1.
Hum Genet ; 141(10): 1615-1627, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35347416

ABSTRACT

Infertility is a major reproductive health issue that affects about 12% of women of reproductive age in the United States. Aneuploidy in eggs accounts for a significant proportion of early miscarriage and in vitro fertilization failure. Recent studies have shown that genetic variants in several genes affect chromosome segregation fidelity and predispose women to a higher incidence of egg aneuploidy. However, the exact genetic causes of aneuploid egg production remain unclear, making it difficult to diagnose infertility based on individual genetic variants in the mother's genome. In this study, we evaluated machine learning-based classifiers for predicting the embryonic aneuploidy risk in female IVF patients using whole-exome sequencing data. Using two exome datasets, we obtained an area under the receiver operating characteristic curve of 0.77 and 0.68, respectively. High precision could be traded off for high specificity in classifying patients by selecting different prediction score cutoffs. For example, a strict prediction score cutoff of 0.7 identified 29% of patients as high-risk with 94% precision. In addition, we identified MCM5, FGGY, and DDX60L as potential aneuploidy risk genes that contribute the most to the predictive power of the model. These candidate genes and their molecular interaction partners are enriched for meiotic-related gene ontology categories and pathways, such as microtubule organizing center and DNA recombination. In summary, we demonstrate that sequencing data can be mined to predict patients' aneuploidy risk, thus improving clinical diagnosis. The candidate genes and pathways we identified are promising targets for future aneuploidy studies.
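
The cutoff trade-off described in this abstract can be illustrated with a minimal sketch. It is not the authors' classifier or data: the function name, the toy scores, and the toy labels below are all made up for illustration. It only shows how raising the prediction score cutoff flags fewer patients while the flagged group contains a larger share of true high-risk cases.

```python
def precision_and_flagged_fraction(scores, labels, cutoff):
    """Return (precision, fraction of patients flagged) for scores >= cutoff.

    scores: prediction scores in [0, 1], one per patient (hypothetical values).
    labels: 1 if the patient is truly high-risk, 0 otherwise.
    """
    flagged = [label for score, label in zip(scores, labels) if score >= cutoff]
    if not flagged:
        # No patient passes the cutoff, so precision is undefined.
        return None, 0.0
    precision = sum(flagged) / len(flagged)
    return precision, len(flagged) / len(scores)


# Toy data, invented for this example only.
scores = [0.95, 0.85, 0.75, 0.65, 0.55, 0.45, 0.35, 0.25, 0.15, 0.05]
labels = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]

strict = precision_and_flagged_fraction(scores, labels, 0.7)
lenient = precision_and_flagged_fraction(scores, labels, 0.5)
print("strict cutoff 0.7:", strict)
print("lenient cutoff 0.5:", lenient)
```

On this toy data the strict cutoff flags fewer patients at higher precision than the lenient one, which mirrors the direction of the trade-off reported above, not its magnitude.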


Subjects
Infertility; Preimplantation Diagnosis; Aneuploidy; DNA; Female; Fertilization in Vitro; Humans; Pregnancy; Exome Sequencing
2.
Nucleic Acids Res ; 47(21): e142, 2019 12 02.
Article in English | MEDLINE | ID: mdl-31584091

ABSTRACT

Evaluating the impact of non-synonymous genetic variants is essential for uncovering disease associations and mechanisms of evolution. An in-depth understanding of sequence changes is also fundamental for synthetic protein design and stability assessments. However, the variant effect predictor performance gain observed in recent years has not kept up with the increased complexity of new methods. One likely reason for this might be that most approaches use similar sets of gene and protein features for modeling variant effects, often emphasizing sequence conservation. While high levels of conservation highlight residues essential for protein activity, much of the variation observable in vivo is arguably weaker in its impact, thus requiring evaluation at a higher level of resolution. Here, we describe function Neutral/Toggle/Rheostat predictor (funtrp), a novel computational method that categorizes protein positions based on the position-specific expected range of mutational impacts: Neutral (weak/no effects), Rheostat (function-tuning positions), or Toggle (on/off switches). We show that position types do not correlate strongly with familiar protein features such as conservation or protein disorder. We also find that position type distribution varies across different protein functions. Finally, we demonstrate that position types can improve performance of existing variant effect predictors and suggest a way forward for the development of new ones.


Subjects
Computational Biology/methods; Conserved Sequence/genetics; Mutation/genetics; Proteins; Amino Acid Sequence/genetics; Base Sequence/genetics; Databases, Protein; Humans; Models, Molecular; Proteins/chemistry; Proteins/genetics; Structure-Activity Relationship
3.
Environ Microbiol ; 22(4): 1619-1634, 2020 04.
Article in English | MEDLINE | ID: mdl-32090420

ABSTRACT

Mercury (Hg) is a highly toxic and widely distributed heavy metal, which some Bacteria and Archaea detoxify by the reduction of ionic Hg (Hg[II]) to the elemental volatile form, Hg(0). This activity is specified by the mer operon. The mer operon of the deeply branching thermophile Thermus thermophilus HB27 encodes an O-acetyl-l-homoacetylserine sulfhydrylase (Oah2), a transcriptional regulator (MerR), a hypothetical protein (hp) and a mercuric reductase (MerA). Here, we show that this operon has two convergently expressed and differentially regulated promoters. An upstream promoter, Poah, controls the constitutive transcription of the entire operon, and a second promoter (Pmer), located within merR, is responsive to Hg(II). In the absence of Hg(II), the transcription of merA is basal, and when Hg(II) is present, merA transcription is induced. This response to Hg(II) is controlled by MerR, and genetic evidence suggests that MerR acts as both a repressor and an activator of Pmer. When the whole merR, including Pmer, is removed, merA is transcribed from Poah independently of Hg(II). These results suggest that the transcriptional regulation of mer in T. thermophilus is both similar to, and different from, the well-documented regulation of proteobacterial mer systems, possibly representing an early step in the evolution of mer-operon regulation.


Subjects
Operon; Thermus thermophilus/genetics; Bacterial Proteins/genetics; DNA-Binding Proteins/genetics; Mercury/metabolism; Oxidoreductases/genetics; Promoter Regions, Genetic; Transcription Factors/genetics
4.
Nucleic Acids Res ; 46(D1): D535-D541, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29112720

ABSTRACT

Microbial functional diversification is driven by environmental factors, i.e. microorganisms inhabiting the same environmental niche tend to be more functionally similar than those from different environments. In some cases, even closely phylogenetically related microbes differ more across environments than across taxa. While microbial similarities are often reported in terms of taxonomic relationships, no existing databases directly link microbial functions to the environment. We previously developed a method for comparing microbial functional similarities on the basis of proteins translated from their sequenced genomes. Here, we describe fusionDB, a novel database that uses our functional data to represent 1374 taxonomically distinct bacteria annotated with available metadata: habitat/niche, preferred temperature, and oxygen use. Each microbe is encoded as a set of functions represented by its proteome and individual microbes are connected via common functions. Users can search fusionDB via combinations of organism names and metadata. Moreover, the web interface allows mapping new microbial genomes to the functional spectrum of reference bacteria, rendering interactive similarity networks that highlight shared functionality. fusionDB provides a fast means of comparing microbes, identifying potential horizontal gene transfer events, and highlighting key environment-specific functionality.
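
The idea of encoding each microbe as a set of functions and connecting microbes through the functions they share can be sketched as follows. This is not fusionDB's implementation: the function name, the microbe and function identifiers, and the choice of Jaccard similarity as an edge weight are assumptions made purely for illustration.

```python
from itertools import combinations


def shared_function_edges(microbes, min_shared=1):
    """Connect every pair of microbes that share at least `min_shared` functions.

    microbes: dict mapping a microbe name to a set of function identifiers
              (hypothetical identifiers, not real fusionDB entries).
    Returns a list of (name_a, name_b, n_shared, jaccard) tuples.
    """
    edges = []
    for name_a, name_b in combinations(sorted(microbes), 2):
        funcs_a, funcs_b = microbes[name_a], microbes[name_b]
        shared = funcs_a & funcs_b
        if len(shared) >= min_shared:
            # Jaccard similarity is an assumed edge weight, chosen for simplicity.
            jaccard = len(shared) / len(funcs_a | funcs_b)
            edges.append((name_a, name_b, len(shared), jaccard))
    return edges


# Toy input, invented for this example only.
microbes = {
    "microbe_A": {"f1", "f2", "f3"},
    "microbe_B": {"f2", "f3", "f4"},
    "microbe_C": {"f5"},
}
edges = shared_function_edges(microbes)
print(edges)
```

In this toy network only microbe_A and microbe_B are linked, because microbe_C shares no function with either of them.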


Subjects
Databases, Factual; Microbiota/physiology; Bacteria/classification; Bacteria/genetics; Bacterial Physiological Phenomena; Bacterial Proteins/genetics; Bacterial Proteins/physiology; Biodiversity; Databases, Genetic; Environmental Microbiology; Gene Transfer, Horizontal; Humans; Internet; Metadata; Metagenomics; Phylogeny; Synechococcus/classification; Synechococcus/genetics; Synechococcus/physiology; User-Computer Interface
5.
Nucleic Acids Res ; 46(4): e23, 2018 02 28.
Article in English | MEDLINE | ID: mdl-29194524

ABSTRACT

The vast majority of microorganisms on Earth reside in often-inseparable environment-specific communities, or microbiomes. Meta-genomic/-transcriptomic sequencing could reveal the otherwise inaccessible functionality of microbiomes. However, existing analytical approaches focus on attributing sequencing reads to known genes/genomes, often failing to make maximal use of available data. We created faser (functional annotation of sequencing reads), an algorithm that is optimized to map reads to molecular functions encoded by the read-correspondent genes. The mi-faser microbiome analysis pipeline, combining faser with our manually curated reference database of protein functions, accurately annotates microbiome molecular functionality. mi-faser's minutes-per-microbiome processing speed is significantly faster than that of other methods, allowing for large scale comparisons. Microbiome function vectors can be compared between different conditions to highlight environment-specific and/or time-dependent changes in functionality. Here, we identified previously unseen oil degradation-specific functions in BP oil-spill data, as well as functional signatures of individual-specific gut microbiome responses to a dietary intervention in children with Prader-Willi syndrome. Our method also revealed variability in Crohn's Disease patient microbiomes and clearly distinguished them from those of related healthy individuals. Our analysis highlighted the microbiome role in CD pathogenicity, demonstrating enrichment of patient microbiomes in functions that promote inflammation and that help bacteria survive it.


Subjects
Metagenomics/methods; Microbiota; Molecular Sequence Annotation/methods; Algorithms; Bacterial Proteins/physiology; Child; Crohn Disease/microbiology; Humans; Prader-Willi Syndrome/microbiology; Sequence Alignment
6.
Hum Mutat ; 40(9): 1486-1494, 2019 09.
Article in English | MEDLINE | ID: mdl-31268618

ABSTRACT

Recent years have seen a drastic increase in the amount of available genomic sequences. Alongside this explosion, hundreds of computational tools were developed to assess the impact of observed genetic variation. Critical Assessment of Genome Interpretation (CAGI) provides a platform to evaluate the performance of these tools in experimentally relevant contexts. In the CAGI-5 challenge assessing the 38 missense variants affecting the human Pericentriolar material 1 protein (PCM1), our SNAP-based submission was the top performer, although it did worse than expected from other evaluations. Here, we compare the CAGI-5 submissions, and 24 additional commonly used variant effect predictors, to analyze the reasons for this observation. We identified per residue conservation, structural, and functional PCM1 characteristics, which may be responsible. As expected, predictors had a hard time distinguishing effect variants in nonconserved positions. They were also better able to call effect variants in a structurally rich region than in a less-structured one; in the latter, they more often correctly identified benign than effect variants. Curiously, most of the protein was predicted to be functionally robust to mutation, a feature that likely makes it a harder problem for generalized variant effect predictors.


Subjects
Autoantigens/genetics; Cell Cycle Proteins/genetics; Computational Biology/methods; Mutation, Missense; Algorithms; Autoantigens/metabolism; Cell Cycle Proteins/metabolism; Databases, Genetic; Genetic Predisposition to Disease; Humans
7.
Hum Mutat ; 40(9): 1495-1506, 2019 09.
Article in English | MEDLINE | ID: mdl-31184403

ABSTRACT

Thermodynamic stability is a fundamental property shared by all proteins. Changes in stability due to mutation are a widespread molecular mechanism in genetic diseases. Methods for the prediction of mutation-induced stability change have typically been developed and evaluated on incomplete and/or biased data sets. As part of the Critical Assessment of Genome Interpretation, we explored the utility of high-throughput variant stability profiling (VSP) assay data as an alternative for the assessment of computational methods and evaluated state-of-the-art predictors against over 7,000 nonsynonymous variants from two proteins. We found that predictions were modestly correlated with actual experimental values. Predictors fared better when evaluated as classifiers of extreme stability effects. With different methods emerging as top performers depending on the metric, it is nontrivial to draw conclusions on their adoption or improvement. Our analyses revealed that only 16% of all variants in VSP assays could be confidently defined as stability-affecting. Furthermore, it is unclear to what extent VSP abundance scores were reasonable proxies for the stability-related quantities that participating methods were designed to predict. Overall, our observations underscore the need for clearly defined objectives when developing and using both computational and experimental methods in the context of measuring variant impact.


Subjects
Computational Biology/methods; Methyltransferases/chemistry; Mutation; PTEN Phosphohydrolase/chemistry; High-Throughput Nucleotide Sequencing; Humans; Methyltransferases/genetics; PTEN Phosphohydrolase/genetics; Protein Stability
8.
Hum Mutat ; 40(9): 1474-1485, 2019 09.
Article in English | MEDLINE | ID: mdl-31260570

ABSTRACT

The CAGI-5 pericentriolar material 1 (PCM1) challenge aimed to predict the effect of 38 transgenic human missense mutations in the PCM1 protein implicated in schizophrenia. Participants were provided with 16 benign variants (negative controls), 10 hypomorphic, and 12 loss of function variants. Six groups participated and were asked to predict the probability of effect and the standard deviation associated with each mutation. Here, we present the challenge assessment. Prediction performance was evaluated using different measures to arrive at a final ranking that highlights the strengths and weaknesses of each group. The results show a great variety of predictions, where some methods performed significantly better than others. Benign variants played an important role as negative controls, highlighting predictors biased to identify disease phenotypes. The best predictor, Bromberg lab, used a neural-network-based method able to discriminate between neutral and non-neutral single nucleotide polymorphisms. The CAGI-5 PCM1 challenge allowed us to evaluate the state-of-the-art techniques for interpreting the effect of novel variants for a difficult target protein.


Subjects
Autoantigens/genetics; Cell Cycle Proteins/genetics; Computational Biology/methods; Mutation, Missense; Schizophrenia/genetics; Databases, Genetic; Genetic Predisposition to Disease; Humans; Neural Networks, Computer; Phenotype; Polymorphism, Single Nucleotide
10.
J Clin Invest ; 131(11)2021 06 01.
Article in English | MEDLINE | ID: mdl-33905375

ABSTRACT

Cancer-associated fibroblasts (CAF) may exert tumor-promoting and tumor-suppressive functions, but the mechanisms underlying these opposing effects remain elusive. Here, we sought to understand these potentially opposing functions by interrogating functional relationships among CAF subtypes, their mediators, desmoplasia, and tumor growth in a wide range of tumor types metastasizing to the liver, the most common organ site for metastasis. Depletion of hepatic stellate cells (HSC), which represented the main source of CAF in mice and patients in our study, or depletion of all CAF decreased tumor growth and mortality in desmoplastic colorectal and pancreatic metastasis but not in nondesmoplastic metastatic tumors. Single-cell RNA-Seq in conjunction with CellPhoneDB ligand-receptor analysis, as well as studies in immune cell-depleted and HSC-selective knockout mice, uncovered direct CAF-tumor interactions as a tumor-promoting mechanism, mediated by myofibroblastic CAF-secreted (myCAF-secreted) hyaluronan and inflammatory CAF-secreted (iCAF-secreted) HGF. These effects were opposed by myCAF-expressed type I collagen, which suppressed tumor growth by mechanically restraining tumor spread, overriding its own stiffness-induced mechanosignals. In summary, mechanical restriction by type I collagen opposes the overall tumor-promoting effects of CAF, thus providing a mechanistic explanation for their dual functions in cancer. Therapeutic targeting of tumor-promoting CAF mediators while preserving type I collagen may convert CAF from tumor promoting to tumor restricting.


Subjects
Cancer-Associated Fibroblasts/metabolism; Collagen Type I/metabolism; Hepatic Stellate Cells/metabolism; Liver Neoplasms, Experimental/metabolism; Mechanotransduction, Cellular; Animals; Cancer-Associated Fibroblasts/pathology; Cell Line, Tumor; Collagen Type I/genetics; Hepatic Stellate Cells/pathology; Humans; Liver Neoplasms, Experimental/genetics; Liver Neoplasms, Experimental/pathology; Mice, Knockout; Neoplasm Metastasis
11.
Microbiologyopen ; 9(9): e1100, 2020 09.
Article in English | MEDLINE | ID: mdl-32762019

ABSTRACT

Microbes active in extreme cold are not as well explored as those of other extreme environments. Studies have revealed a substantial microbial diversity and identified cold-specific microbiome molecular functions. We analyzed the metagenomes and metatranscriptomes of 20 snow samples collected in early and late spring in Svalbard, Norway using mi-faser, our read-based computational microbiome function annotation tool. Our results reveal a more diverse microbiome functional capacity and activity in the early- vs. late-spring samples. We also find that functional dissimilarity between the same-sample metagenomes and metatranscriptomes is significantly higher in early than late spring samples. These findings suggest that early spring samples may contain a larger fraction of DNA of dormant (or dead) organisms, while late spring samples reflect a new, metabolically active community. We further show that the abundance of sequencing reads mapping to the fatty acid synthesis-related microbial pathways in late spring metagenomes and metatranscriptomes is significantly correlated with the organic acid levels measured in these samples. Similarly, the organic acid levels correlate with the pathway read abundances of geraniol degradation and inversely correlate with those of styrene degradation, suggesting a possible nutrient change. Our study thus highlights the activity of microbial degradation pathways of complex organic compounds previously unreported at low temperatures.


Subjects
Bacteria/metabolism; Microbiota/physiology; Organic Chemicals/metabolism; Snow/microbiology; Acyclic Monoterpenes/metabolism; Carbon/metabolism; Fatty Acids/biosynthesis; Metabolic Networks and Pathways; Metagenome; Microbiota/genetics; Norway; Seasons; Styrene/metabolism; Transcriptome
12.
Biol Direct ; 14(1): 19, 2019 10 30.
Article in English | MEDLINE | ID: mdl-31666099

ABSTRACT

BACKGROUND: Accumulating evidence suggests that the human microbiome impacts individual and public health. City subway systems are human-dense environments, where passengers often exchange microbes. The MetaSUB project participants collected samples from subway surfaces in different cities and performed metagenomic sequencing. Previous studies focused on the taxonomic composition of these microbiomes, and no explicit functional analysis had been done until now. RESULTS: As a part of the 2018 CAMDA challenge, we functionally profiled the available ~400 subway metagenomes and built a predictor of city origin. In cross-validation, our model reached 81% accuracy when only the top-ranked city assignment was considered and 95% accuracy if the second city was taken into account as well. Notably, this performance was only achievable if the distribution of cities in the training and testing sets was similar. To ensure that our methods are applicable without such biased assumptions, we balanced our training data to account for all represented cities equally well. After balancing, the performance of our method was slightly lower (76/94%, respectively, for one or two top-ranked cities), but still consistently high. Here we attained an added benefit of independence of training set city representation. In testing, our unbalanced model thus reached (an over-estimated) performance of 90/97%, while our balanced model was at a more reliable 63/90% accuracy. While, by definition of our model, we were not able to predict the microbiome origins previously unseen, our balanced model correctly judged them to be NOT-from-training-cities over 80% of the time. Our function-based outlook on microbiomes also allowed us to note similarities between both regionally close and far-away cities. Curiously, we identified the depletion in mycobacterial functions as a signature of cities in New Zealand, while photosynthesis related functions fingerprinted New York, Porto and Tokyo.
CONCLUSIONS: We demonstrated the power of our high-speed function annotation method, mi-faser, by analysing ~400 shotgun metagenomes in 2 days, with the results recapitulating functional signals of different city subway microbiomes. We also showed the importance of balanced data in avoiding over-estimated performance. Our results revealed similarities between both geographically close (Ofa and Ilorin) and distant (Boston and Porto, Lisbon and New York) city subway microbiomes. The photosynthesis related functional signatures of NYC were previously unseen in taxonomy studies, highlighting the strength of functional analysis.
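
The two evaluation ideas in this abstract, scoring a prediction as correct when the true city is among the one or two top-ranked assignments, and balancing the training data across cities, can be sketched as below. This is an illustration under stated assumptions, not the authors' code: the function names and toy data are hypothetical, and downsampling every city to the size of the smallest one is only one possible balancing scheme, since the abstract does not say which was used.

```python
import random
from collections import defaultdict


def top_k_accuracy(ranked_predictions, true_cities, k):
    """Fraction of samples whose true city is among the k top-ranked predictions."""
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_cities)
        if truth in ranked[:k]
    )
    return hits / len(true_cities)


def balance_by_downsampling(samples, seed=0):
    """Downsample each city to the size of the smallest one (assumed scheme).

    samples: list of (features, city) pairs.
    """
    by_city = defaultdict(list)
    for features, city in samples:
        by_city[city].append((features, city))
    smallest = min(len(group) for group in by_city.values())
    rng = random.Random(seed)
    balanced = []
    for city in sorted(by_city):
        balanced.extend(rng.sample(by_city[city], smallest))
    return balanced


# Toy ranked predictions (best guess first), invented for this example only.
ranked = [["NYC", "Boston"], ["Porto", "Lisbon"], ["Tokyo", "NYC"], ["Boston", "Porto"]]
truth = ["NYC", "Lisbon", "NYC", "Tokyo"]
top1 = top_k_accuracy(ranked, truth, k=1)
top2 = top_k_accuracy(ranked, truth, k=2)

# Toy training set with an over-represented city.
training = [("s1", "NYC"), ("s2", "NYC"), ("s3", "NYC"), ("s4", "Porto")]
balanced = balance_by_downsampling(training)
print(top1, top2, balanced)
```

Top-2 accuracy can never be lower than top-1 accuracy, which is why the paired figures in the abstract (for example 76/94%) always increase from the first number to the second.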


Subjects
Metagenome; Microbiota; Railroads; Cities
13.
Genome Med ; 11(1): 59, 2019 09 30.
Article in English | MEDLINE | ID: mdl-31564248

ABSTRACT

BACKGROUND: After years of concentrated research efforts, the exact cause of Crohn's disease (CD) remains unknown. Its accurate diagnosis, however, helps in management and preventing the onset of disease. Genome-wide association studies have identified 241 CD loci, but these carry small log odds ratios and are thus diagnostically uninformative. METHODS: Here, we describe a machine learning method, AVA,Dx (Analysis of Variation for Association with Disease), that uses exonic variants from whole exome or genome sequencing data to extract CD signal and predict CD status. Using the person-specific coding variation in genes from a panel of only 111 individuals, we built disease-prediction models informative of previously undiscovered disease genes. By additionally accounting for batch effects, we were able to accurately predict CD status for thousands of previously unseen individuals from other panels. RESULTS: AVA,Dx highlighted known CD genes including NOD2 and new potential CD genes. AVA,Dx identified 16% (at strict cutoff) of CD patients at 99% precision and 58% of the patients (at default cutoff) with 82% precision in over 3000 individuals from separately sequenced panels. CONCLUSIONS: Larger training panels and additional features, including other types of genetic variants and environmental factors, e.g., human-associated microbiota, may improve model performance. However, the results presented here already position AVA,Dx as both an effective method for revealing pathogenesis pathways and as a CD risk analysis tool, which can improve clinical diagnostic time and accuracy. Links to the AVA,Dx Docker image and the BitBucket source code are at https://bromberglab.org/project/avadx/ .


Subjects
Crohn Disease/diagnosis; Exome/genetics; Genetic Markers; Genetic Predisposition to Disease; Metagenome; Polymorphism, Single Nucleotide; Crohn Disease/genetics; Crohn Disease/microbiology; Genome-Wide Association Study; Humans; Machine Learning; Prognosis
14.
J Integr Bioinform ; 14(2)2017 Jun 13.
Article in English | MEDLINE | ID: mdl-28609295

ABSTRACT

With the advent of modern day high-throughput technologies, the bottleneck in biological discovery has shifted from the cost of doing experiments to that of analyzing results. clubber is our automated cluster-load balancing system developed for optimizing these "big data" analyses. Its plug-and-play framework encourages re-use of existing solutions for bioinformatics problems. clubber's goals are to reduce computation times and to facilitate use of cluster computing. The first goal is achieved by automating the balance of parallel submissions across available high performance computing (HPC) resources. Notably, the latter can be added on demand, including cloud-based resources, and/or featuring heterogeneous environments. The second goal of making HPCs user-friendly is facilitated by an interactive web interface and a RESTful API, allowing for job monitoring and result retrieval. We used clubber to speed up our pipeline for annotating molecular functionality of metagenomes. Here, we analyzed the Deepwater Horizon oil-spill study data to quantitatively show that the beach sands have not yet entirely recovered. Further, our analysis of the CAMI-challenge data revealed that microbiome taxonomic shifts do not necessarily correlate with functional shifts. These examples (21 metagenomes processed in 172 min) clearly illustrate the importance of clubber in the everyday computational biology environment.
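
Balancing parallel submissions across available computing resources can be illustrated with a simple greedy scheme: always hand the next job to the currently least-loaded cluster. This is only a sketch of the general idea. The abstract does not describe clubber's actual scheduling logic, and the function name, job identifiers, cluster names, and cost estimates below are all hypothetical.

```python
import heapq


def assign_jobs(job_costs, cluster_names):
    """Greedily assign jobs to the least-loaded cluster, largest jobs first.

    job_costs: dict mapping a job id to an estimated cost (for example, minutes).
    cluster_names: names of the available clusters.
    Returns a dict mapping each cluster name to the list of job ids it receives.
    """
    # Min-heap of (current load, cluster name); ties are broken by name.
    heap = [(0.0, name) for name in sorted(cluster_names)]
    heapq.heapify(heap)
    assignment = {name: [] for name in cluster_names}
    for job_id, cost in sorted(job_costs.items(), key=lambda item: -item[1]):
        load, name = heapq.heappop(heap)
        assignment[name].append(job_id)
        heapq.heappush(heap, (load + cost, name))
    return assignment


# Toy workload, invented for this example only.
job_costs = {"job1": 5, "job2": 3, "job3": 2, "job4": 2}
assignment = assign_jobs(job_costs, ["cluster_a", "cluster_b"])
print(assignment)
```

A greedy scheme like this does not guarantee an optimal split, but it is cheap to compute and accommodates a newly added cluster naturally, since the new cluster starts with zero load and receives the next jobs.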


Subjects
Computational Biology/methods; Computing Methodologies; Software; Automation; Bathing Beaches; Datasets as Topic; Gulf of Mexico; Metagenome/genetics; Microbiota/genetics; Molecular Sequence Annotation; Petroleum Pollution/adverse effects