Results 1 - 20 of 52
1.
Proc Natl Acad Sci U S A ; 116(35): 17377-17382, 2019 08 27.
Article in English | MEDLINE | ID: mdl-31409704

ABSTRACT

Gross Chromosomal Rearrangements (GCRs) play an important role in human diseases, including cancer. Although most of the nonessential Genome Instability Suppressing (GIS) genes in Saccharomyces cerevisiae are known, the essential genes in which mutations can cause increased GCR rates are not well understood. Here 2 S. cerevisiae GCR assays were used to screen a targeted collection of temperature-sensitive mutants to identify mutations that caused increased GCR rates. This identified 94 essential GIS (eGIS) genes in which mutations cause increased GCR rates and 38 candidate eGIS genes that encode eGIS1 protein-interacting or family member proteins. Analysis of TCGA data using the human genes predicted to encode the proteins and protein complexes implicated by the S. cerevisiae eGIS genes revealed a significant enrichment of mutations affecting predicted human eGIS genes in 10 of the 16 cancers analyzed.


Subject(s)
Genes, Suppressor , Genome, Fungal , Genomic Instability , Neoplasms/genetics , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae/genetics , Tumor Suppressor Proteins/genetics , DNA Damage , Humans , Mutation , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/metabolism , Tumor Suppressor Proteins/metabolism
2.
PLoS Genet ; 9(1): e1003242, 2013.
Article in English | MEDLINE | ID: mdl-23359205

ABSTRACT

The era of whole-genome sequencing has revealed that gene copy-number changes caused by duplication and deletion events have important evolutionary, functional, and phenotypic consequences. Recent studies have therefore focused on revealing the extent of variation in copy-number within natural populations of humans and other species. These studies have found a large number of copy-number variants (CNVs) in humans, many of which have been shown to have clinical or evolutionary importance. For the most part, these studies have failed to detect an important class of gene copy-number polymorphism: gene duplications caused by retrotransposition, which result in a new intron-less copy of the parental gene being inserted into a random location in the genome. Here we describe a computational approach leveraging next-generation sequence data to detect gene copy-number variants caused by retrotransposition (retroCNVs), and we report the first genome-wide analysis of these variants in humans. We find that retroCNVs account for a substantial fraction of gene copy-number differences between any two individuals. Moreover, we show that these variants may often result in expressed chimeric transcripts, underscoring their potential for the evolution of novel gene functions. By locating the insertion sites of these duplicates, we are able to show that retroCNVs have had an important role in recent human adaptation, and we also uncover evidence that positive selection may currently be driving multiple retroCNVs toward fixation. Together these findings imply that retroCNVs are an especially important class of polymorphism, and that future studies of copy-number variation should search for these variants in order to illuminate their potential evolutionary and functional relevance.


Subject(s)
Computational Biology/methods , DNA Copy Number Variations/genetics , Gene Duplication , Retroelements/genetics , Base Sequence , Biological Evolution , Chromosome Mapping , Humans , Introns , Phenotype , Sequence Analysis, DNA , Sequence Deletion
3.
BMC Genomics ; 16: 536, 2015 Jul 22.
Article in English | MEDLINE | ID: mdl-26194008

ABSTRACT

BACKGROUND: Differences in gene expression have a significant role in the diversity of phenotypes in humans. Here we integrated human public data from ENCODE, 1000 Genomes and Geuvadis to explore the populational landscape of INDELs affecting transcription factor-binding sites (TFBS). A significant fraction of TFBS close to the transcription start site of known genes is affected by INDELs, with a consequent effect on the expression of the associated gene. RESULTS: Hundreds of TFBS-affecting INDELs (TFBS-ID) show a differential frequency between human populations, suggesting a role of natural selection in the spread of such variant INDELs. A comparison with a dataset of known human genomic regions under natural selection allowed us to identify several cases of TFBS-ID likely involved in populational adaptations. Ontology analyses on the differential TFBS-ID further indicated several biological processes under natural selection in different populations. CONCLUSION: Together, our results strongly suggest that INDELs have an important role in modulating gene expression patterns in humans. The dataset we make available, together with other data reporting variability at both regulatory and coding regions of genes, represents a powerful tool for studies aiming to better understand the evolution of gene regulatory networks in humans.


Subject(s)
Binding Sites/genetics , Genome, Human , INDEL Mutation/genetics , Transcription Factors/genetics , Chromosome Mapping , Humans , Promoter Regions, Genetic , Protein Binding , Transcription Initiation Site
4.
Bioessays ; 34(8): 655-7, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22528879

ABSTRACT

Domains can spread among proteins in a process called domain shuffling and this has been identified as one of the major mechanisms leading to the formation of new proteins throughout evolution. This process has an impact on the topology of protein-protein interaction networks as it may create new hubs and also increase interconnectivity.


Subject(s)
Evolution, Molecular , Protein Interaction Mapping , Proteins/chemistry , Animals , Humans , Introns , Protein Biosynthesis , Protein Interaction Maps , Species Specificity , Systems Biology
5.
J Mol Evol ; 76(4): 228-39, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23529588

ABSTRACT

Despite evidence that at the interspecific scale, exonic splicing silencers (ESSs) are under negative selection in constitutive exons, little is known about the effects of slightly deleterious polymorphisms on these splicing regulators. Through the application of a modified version of the McDonald-Kreitman test, we compared the normalized proportions of human polymorphisms and human/rhesus substitutions affecting exonic splicing regulators (ESRs) on sequences of constitutive and alternative exons. Our results show a depletion of substitutions and an enrichment of SNPs associated with ESS gain in constitutive exons. Moreover, we show that this evolutionary pattern is also present in a set of ESRs previously involved in the transition from constitutive to skipped exons in the mammalian lineage. The similarity between these two sets of ESRs suggests that the transition from constitutive to skipped exons in mammals is more frequently associated with the inhibition than with the promotion of splicing signals. This is in accordance with the hypothesis of a constitutive origin of exon skipping and corroborates previous findings about the antagonistic role of certain exonic splicing enhancers.


Subject(s)
Biological Evolution , Exons , Polymorphism, Single Nucleotide , RNA Splicing , Regulatory Sequences, Nucleic Acid , Selection, Genetic , Animals , Enhancer Elements, Genetic , Humans , Mammals/genetics , Models, Genetic
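The abstract of item 5 rests on a McDonald-Kreitman-style comparison of polymorphisms and substitutions. As a rough illustration only, the sketch below computes the neutrality index of the classic McDonald-Kreitman 2x2 table, not the authors' modified test; the function name, the "test class" versus "control class" framing, and all counts are hypothetical assumptions.

```python
def neutrality_index(p_test, p_control, d_test, d_control):
    """Neutrality index of a classic McDonald-Kreitman 2x2 table.

    p_* are polymorphism counts and d_* are fixed-substitution counts,
    each split into a test class (e.g. changes creating a silencer motif)
    and a control class. The split itself is an assumption of this sketch.
    NI = (p_test / p_control) / (d_test / d_control).
    """
    if p_control == 0 or d_test == 0:
        raise ValueError("p_control and d_test must be non-zero")
    # Algebraically equal to the ratio of ratios, with a single division.
    return (p_test * d_control) / (p_control * d_test)


# Hypothetical counts, chosen only to show the arithmetic.
ni = neutrality_index(p_test=20, p_control=40, d_test=10, d_control=60)
print(ni)
```

A neutrality index above 1 indicates an excess of polymorphism relative to divergence in the test class, which is the direction of the pattern the abstract reports (enrichment of SNPs and depletion of substitutions).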
6.
Nucleic Acids Res ; 39(12): 4942-8, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21398627

ABSTRACT

With the availability of a large amount of genomic data it is expected that the influence of single nucleotide variations (SNVs) in many biological phenomena will be elucidated. Here, we approached the problem of how SNVs affect alternative splicing. First, we observed that SNVs and exonic splicing regulators (ESRs) independently show a biased distribution in alternative exons. More importantly, SNVs map more frequently in ESRs located in alternative exons than in ESRs located in constitutive exons. By looking at SNVs associated with alternative exon/intron borders (by their common presence in the same cDNA molecule), we observed that a specific type of ESR, the exonic splicing silencers (ESSs), are more frequently modified by SNVs. Our results establish a clear association between genetic diversity and alternative splicing involving ESSs.


Subject(s)
Alternative Splicing , Exons , Polymorphism, Single Nucleotide , Regulatory Sequences, Ribonucleic Acid , Humans , Introns
7.
Nucleic Acids Res ; 39(14): 6056-68, 2011 Aug.
Article in English | MEDLINE | ID: mdl-21493686

ABSTRACT

Although patterns of somatic alterations have been reported for tumor genomes, little is known about how they compare with alterations present in non-tumor genomes. A comparison of the two would be crucial to better characterize the genetic alterations driving tumorigenesis. We sequenced the genomes of a lymphoblastoid (HCC1954BL) and a breast tumor (HCC1954) cell line derived from the same patient and compared the somatic alterations present in both. The lymphoblastoid genome presents a comparable number and similar spectrum of nucleotide substitutions to that found in the tumor genome. However, a significant difference in the ratio of non-synonymous to synonymous substitutions was observed between the two genomes (P = 0.031). Protein-protein interaction analysis revealed that mutations in the tumor genome preferentially affect hub-genes (P = 0.0017) and are co-selected to present synergistic functions (P < 0.0001). KEGG analysis showed that in the tumor genome most mutated genes were organized into signaling pathways related to tumorigenesis. No such organization or synergy was observed in the lymphoblastoid genome. Our results indicate that endogenous mutagens and replication errors can generate the overall number of mutations required to drive tumorigenesis and that it is the combination rather than the frequency of mutations that is crucial to complete tumorigenic transformation.


Subject(s)
Breast Neoplasms/genetics , Genetic Variation , Genome, Human , Cell Line, Transformed , Cell Line, Tumor , Chromosome Aberrations , Female , Humans , Lymphocytes , Middle Aged , Mutation , Point Mutation , Protein Interaction Mapping , Sequence Analysis, DNA
8.
Genetica ; 140(4-6): 249-57, 2012 Jun.
Article in English | MEDLINE | ID: mdl-22948334

ABSTRACT

Exon shuffling has been characterized as one of the major evolutionary forces shaping both the genome and the proteome of eukaryotes. This mechanism was particularly important in the creation of multidomain proteins during animal evolution, bringing a number of functional genetic novelties. Here, genome information from a variety of eukaryotic species was used to address several issues related to the evolutionary history of exon shuffling. By comparing all protein sequences within each species, we were able to characterize exon shuffling signatures throughout metazoans. Intron phase (the position of the intron regarding the codon) and exon symmetry (the pattern of flanking introns for a given exon or block of adjacent exons) were features used to evaluate exon shuffling. We confirmed previous observations that exon shuffling mediated by phase 1 introns (1-1 exon shuffling) is the predominant kind in multicellular animals. Evidence is provided that such a pattern has been in place since the early steps of animal evolution, supported by a detectable presence of 1-1 shuffling units in Trichoplax adhaerens and a considerable prevalence of them in Nematostella vectensis. In contrast, Monosiga brevicollis, one of the closest relatives of metazoans, and Arabidopsis thaliana showed no evidence of 1-1 exon or domain shuffling above what would be expected by chance. Instead, exon shuffling events are less abundant and predominantly mediated by phase 0 introns (0-0 exon shuffling) in those non-metazoan species. Moreover, an intermediate pattern of 1-1 and 0-0 exon shuffling was observed for the placozoan T. adhaerens, a primitive animal. Finally, characterization of flanking intron phases around domain borders allowed us to identify a common set of symmetric 1-1 domains that have been shuffled throughout the metazoan lineage.


Subject(s)
Evolution, Molecular , Exons , Recombination, Genetic , Animals , Cluster Analysis , Computational Biology/methods , Humans , Introns , Open Reading Frames/genetics , Plants/genetics , Protein Interaction Domains and Motifs/genetics
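The abstract of item 8 defines intron phase and exon symmetry in words. A minimal sketch of those two definitions follows, assuming the input is a list of coding-exon lengths in transcript order; the function names and the example lengths are hypothetical and are not taken from the paper.

```python
def intron_phases(cds_exon_lengths):
    """Phase of each intron, given coding-exon lengths in transcript order.

    Phase 0: the intron lies between two codons.
    Phase 1: it falls after the first base of a codon.
    Phase 2: it falls after the second base of a codon.
    """
    phases = []
    coding_bases = 0
    for length in cds_exon_lengths[:-1]:  # no intron follows the last exon
        coding_bases += length
        phases.append(coding_bases % 3)
    return phases


def exon_classes(cds_exon_lengths):
    """(upstream phase, downstream phase) for each internal exon."""
    phases = intron_phases(cds_exon_lengths)
    return [(phases[i - 1], phases[i]) for i in range(1, len(phases))]


def is_symmetric(exon_class):
    """An exon is symmetric when both flanking introns share one phase."""
    return exon_class[0] == exon_class[1]


# Hypothetical gene with four coding exons.
lengths = [100, 150, 62, 90]
for exon_class in exon_classes(lengths):
    print(exon_class, "symmetric" if is_symmetric(exon_class) else "asymmetric")
```

In this encoding, a "1-1" exon is one whose class is `(1, 1)`; such an exon can be inserted into another phase 1 intron without shifting the reading frame, which is why symmetry matters for shuffling.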
9.
RNA Biol ; 9(11): 1339-43, 2012 Nov.
Article in English | MEDLINE | ID: mdl-23064119

ABSTRACT

Understanding alternative splicing is crucial to elucidate the mechanisms behind several biological phenomena, including diseases. The huge amount of expressed sequences available nowadays represents an opportunity and a challenge to catalog and display alternative splicing events (ASEs). Although several groups have faced this challenge with relative success, we still lack a computational tool that uses a simple and straightforward method to retrieve, name and present ASEs. Here we present SPLOOCE, a portal for the analysis of human splicing variants. SPLOOCE uses a method based on regular expressions for retrieval of ASEs. We propose a simple syntax that is able to capture the complexity of ASEs.


Subject(s)
Alternative Splicing , Computational Biology , Databases, Nucleic Acid , RNA Splice Sites , Humans , Internet , Oligonucleotide Array Sequence Analysis
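The abstract of item 9 says SPLOOCE retrieves alternative splicing events with regular expressions but does not give its syntax. The sketch below shows the general idea under an invented encoding, which should not be read as SPLOOCE's actual notation: each transcript is written as one character per known exon of the gene, `E` for an included exon and `-` for a skipped one.

```python
import re

# One or more skipped exons with an included exon on each side.
# Lookarounds keep the flanking exons out of the match, so adjacent
# skipping events that share a flanking exon are both found.
SKIP_PATTERN = re.compile(r"(?<=E)-+(?=E)")


def skipped_exons(usage):
    """Return 1-based numbers of exons skipped between two included exons.

    `usage` is a hypothetical per-transcript string such as "E-EE".
    """
    skipped = []
    for match in SKIP_PATTERN.finditer(usage):
        skipped.extend(range(match.start() + 1, match.end() + 1))
    return skipped


print(skipped_exons("E-EE"))
print(skipped_exons("E--E-E"))
```

Leading or trailing `-` characters are deliberately not reported, since under this encoding they could equally reflect an incomplete transcript rather than exon skipping.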
10.
Proc Natl Acad Sci U S A ; 106(6): 1886-91, 2009 Feb 10.
Article in English | MEDLINE | ID: mdl-19181860

ABSTRACT

We have identified new genomic alterations in the breast cancer cell line HCC1954, using high-throughput transcriptome sequencing. With 120 Mb of cDNA sequences, we were able to identify genomic rearrangement events leading to fusions or truncations of genes including MRE11 and NSD1, genes already implicated in oncogenesis, and 7 rearrangements involving other additional genes. This approach demonstrates that high-throughput transcriptome sequencing is an effective strategy for the characterization of genomic rearrangements in cancers.


Subject(s)
Breast Neoplasms/genetics , Gene Expression Profiling/methods , Gene Rearrangement , Genome, Human/genetics , Base Sequence , Carrier Proteins/genetics , Cell Line, Tumor , DNA, Complementary , DNA-Binding Proteins/genetics , Female , Histone-Lysine N-Methyltransferase , Humans , MRE11 Homologue Protein , Neoplasm Proteins/genetics , Nuclear Proteins/genetics
11.
PLoS One ; 17(1): e0262419, 2022.
Article in English | MEDLINE | ID: mdl-35085295

ABSTRACT

Genetic predisposition accounts for nearly 10% of all melanoma cases and has been associated with a dozen moderate- to high-penetrance genes, including CDKN2A, CDK4, POT1 and BAP1. However, in most melanoma-prone families, the genetic etiology of cancer predisposition remains undetermined. The goal of this study was to identify rare genomic variants associated with cutaneous melanoma susceptibility in melanoma-prone families. Whole-exome sequencing was performed in 2 affected individuals of 5 melanoma-prone families negative for mutations in CDKN2A and CDK4, the major cutaneous melanoma risk genes. A total of 288 rare coding variants shared by the affected relatives of each family were identified, including 7 loss-of-function variants. By performing in silico analyses of gene function, biological pathways, and variant pathogenicity prediction, we underscored the putative role of several genes in melanoma risk, including previously described genes such as MYO7A and WRN, as well as new putative candidates, such as SERPINB4, HRNR, and NOP10. In conclusion, our data revealed rare germline variants in melanoma-prone families, contributing a novel set of potential candidate genes to be further investigated in future studies.


Subject(s)
Genetic Predisposition to Disease/genetics , Melanoma/genetics , Mutation/genetics , Skin Neoplasms/genetics , Adolescent , Adult , Aged , Brazil , Female , Genotype , Humans , Male , Middle Aged , Pedigree , Penetrance , Exome Sequencing/methods , Melanoma, Cutaneous Malignant
12.
Front Genet ; 12: 617915, 2021.
Article in English | MEDLINE | ID: mdl-33613639

ABSTRACT

Extended phenotypes are manifestations of genes that occur outside of the organism that possesses those genes. In spite of their widespread occurrence, the role of extended phenotypes in evolutionary biology is still a matter of debate. Here, we explore the indirect effects of extended phenotypes, especially their shared use, on the fitness of simulated individuals and populations. A computer simulation platform was developed in which different populations were compared regarding their ability to produce, use, and share extended phenotypes. Our results show that populations that produce and share extended phenotypes outrun populations that only produce them. A specific parameter in the simulations, a bonus for sharing extended phenotypes among conspecifics, has a more significant impact in defining which population will prevail. All these findings strongly support the view, postulated by the extended fitness hypothesis (EFH), that extended phenotypes play a significant role at the population level and their shared use increases population fitness. Our simulation platform is available at https://github.com/guilherme-araujo/gsop-dist.

13.
BMC Genomics ; 11 Suppl 5: S11, 2010 Dec 22.
Article in English | MEDLINE | ID: mdl-21210967

ABSTRACT

BACKGROUND: Physical protein-protein interaction (PPI) is a critical phenomenon for the function of most proteins in living organisms and a significant fraction of PPIs are the result of domain-domain interactions. Exon shuffling, intron-mediated recombination of exons from existing genes, is known to have been a major mechanism of domain shuffling in metazoans. Thus, we hypothesized that exon shuffling could have a significant influence in shaping the topology of PPI networks. RESULTS: We tested our hypothesis by compiling exon shuffling and PPI data from six eukaryotic species: Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, Cryptococcus neoformans and Arabidopsis thaliana. For all four metazoan species, genes enriched in exon shuffling events presented on average higher vertex degree (number of interacting partners) in PPI networks. Furthermore, we verified that a set of protein domains that are simultaneously promiscuous (known to interact with multiple types of other domains), self-interacting (able to interact with another copy of themselves) and abundant in the genomes presents a stronger signal for exon shuffling. CONCLUSIONS: Exon shuffling appears to have been a recurrent mechanism for the emergence of new PPIs along metazoan evolution. In metazoan genomes, exon shuffling also promoted the expansion of some protein domains. We speculate that their promiscuous and self-interacting properties may have been decisive for that expansion.


Subject(s)
Evolution, Molecular , Exons/genetics , Protein Binding/genetics , Protein Structure, Tertiary/genetics , Proteins/metabolism , Recombination, Genetic/genetics , Amyloid beta-Protein Precursor/genetics , Animals , Humans , Protein Interaction Mapping , Protein Isoforms/genetics
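The abstract of item 13 uses vertex degree, the number of interacting partners of a protein in a PPI network, as its key measure. A minimal sketch of that count from a list of pairwise interactions follows; the protein names are placeholders, and ignoring self-interactions when counting partners is an assumption of this sketch rather than something the abstract states.

```python
from collections import defaultdict


def vertex_degrees(interactions):
    """Number of distinct interacting partners for each protein.

    `interactions` is an iterable of (protein_a, protein_b) pairs.
    Duplicate pairs and reversed pairs are counted once. Self-interactions
    are skipped here, which is a simplifying assumption.
    """
    partners = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue
        partners[a].add(b)
        partners[b].add(a)
    return {protein: len(others) for protein, others in partners.items()}


# Placeholder interaction list; ("B", "A") repeats ("A", "B").
edges = [("A", "B"), ("A", "C"), ("B", "A"), ("C", "D"), ("D", "D")]
print(vertex_degrees(edges))
```

Comparing the mean degree of genes enriched in exon shuffling with that of the remaining genes would then be a matter of averaging these values over each gene set.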
14.
BMC Genomics ; 11 Suppl 5: S4, 2010 Dec 22.
Article in English | MEDLINE | ID: mdl-21210970

ABSTRACT

BACKGROUND: Alternative splicing (AS) is a central mechanism in the generation of genomic complexity and is a major contributor to transcriptome and proteome diversity. Alterations of the splicing process can lead to deregulation of crucial cellular processes and have been associated with a large spectrum of human diseases. Cancer-associated transcripts are potential molecular markers and may contribute to the development of more accurate diagnostic and prognostic methods and also serve as therapeutic targets. Alternative splicing-enriched cDNA libraries have been used to explore the variability generated by alternative splicing. In this study, by combining the use of trapping heteroduplexes and RNA amplification, we developed a powerful approach that enables transcriptome-wide exploration of the AS repertoire for identifying AS variants associated with breast tumor cells modulated by ERBB2 (HER-2/neu) oncogene expression. RESULTS: The human breast cell line (C5.2) and a pool of 5 ERBB2 over-expressing breast tumor samples were used independently for the construction of two AS-enriched libraries. In total, 2,048 partial cDNA sequences were obtained, revealing 214 alternative splicing sequence-enriched tags (ASSETs). A subset with 79 multiple exon ASSETs was compared to public databases and reported 138 different AS events. A high success rate of RT-PCR validation (94.5%) was obtained, and 2 novel AS events were identified. The influence of ERBB2-mediated expression on AS regulation was evaluated by capillary electrophoresis and probe-ligation approaches in two mammary cell lines (Hb4a and C5.2) expressing different levels of ERBB2. The relative expression balance between AS variants from 3 genes was differentially modulated by ERBB2 in this model system. CONCLUSIONS: In this study, we presented a method for exploring AS from any RNA source in a transcriptome-wide format, which can be directly and easily adapted to next generation sequencers. We identified AS transcripts that were differently modulated by ERBB2-mediated expression and that can be tested as molecular markers for breast cancer. Such a methodology will be useful for completely deciphering the cancer cell transcriptome diversity resulting from AS and for finding more precise molecular markers.


Subject(s)
Alternative Splicing/genetics , Breast Neoplasms/genetics , Gene Expression Profiling , Gene Library , Genetic Variation , Receptor, ErbB-2/metabolism , Cell Line, Tumor , Cloning, Molecular , Computational Biology , Female , Humans , Oligonucleotides/genetics , Receptor, ErbB-2/genetics , Reverse Transcriptase Polymerase Chain Reaction , Sequence Analysis, DNA
15.
Genomics ; 94(3): 153-60, 2009 Sep.
Article in English | MEDLINE | ID: mdl-19540335

ABSTRACT

Cancer/testis Antigens (CTAs) are immunogenic proteins with a restricted expression pattern in normal tissues and aberrant expression in different types of tumors, and are considered promising candidates for immunotherapy. We used the alignment between EST sequences and the human genome sequence to identify novel CT genes. By examining the EST tissue composition of known CT clusters we defined parameters for the selection of 1184 EST clusters corresponding to putative CT genes. The expression pattern of 70 CT gene candidates was evaluated by RT-PCR in 21 normal tissues, 17 tumor cell lines and 160 primary tumors. We were able to identify 4 CT genes expressed in different types of tumors. The presence of antibodies against the protein encoded by 1 of these 4 CT genes (FAM46D) was exclusively detected in plasma samples from cancer patients. Due to its restricted expression pattern and immunogenicity, FAM46D represents a novel target for cancer immunotherapy.


Subject(s)
Antigens, Neoplasm/immunology , Expressed Sequence Tags , Neoplasm Proteins/immunology , Neoplasms/blood , Antigens, Neoplasm/genetics , Case-Control Studies , Databases, Nucleic Acid , Genome, Human , Humans , Male , Neoplasm Proteins/genetics , Neoplasm Proteins/metabolism , Neoplasms/pathology , Nucleotidyltransferases , Recombinant Proteins/genetics , Recombinant Proteins/immunology , Testis/immunology , Tumor Cells, Cultured
16.
Front Genet ; 11: 548507, 2020.
Article in English | MEDLINE | ID: mdl-33193622

ABSTRACT

Studies on the peopling of South America have been limited by the paucity of sequence data from Native Americans, especially from the eastern part of the Amazon region. Here, we investigate the whole exome variation from 58 Native American individuals (eight different populations) from the Amazon region and draw insights into the peopling of South America. By using the sequence data generated here together with data from the public domain, we confirmed a strong genetic distinction between Andean and Amazonian populations. By testing distinct demographic models, our analysis supports a scenario for the occupation of South America that involves migrations along the Pacific and Atlantic coasts. Occupation of the southeast part of South America would involve migrations from the north, rather than from the west of the continent.

17.
BMC Med Genomics ; 13(1): 30, 2020 02 22.
Article in English | MEDLINE | ID: mdl-32087727

ABSTRACT

BACKGROUND: Cancer neoantigens have attracted great interest in immunotherapy due to their capacity to elicit antitumoral responses. These molecules arise from somatic mutations in cancer cells, resulting in alterations to the original protein. Neoantigen identification remains a challenging task due largely to a high rate of false-positives. RESULTS: We have developed an efficient and automated pipeline for the identification of potential neoantigens. neoANT-HILL integrates several immunogenomic analyses to improve neoantigen detection from next-generation sequencing (NGS) data. The pipeline has been compiled in a pre-built Docker image such that minimal computational background is required for download and setup. neoANT-HILL was applied to The Cancer Genome Atlas (TCGA) melanoma dataset and found several putative neoantigens including ones derived from the recurrent RAC1:P29S and SERPINB3:E250K mutations. neoANT-HILL was also used to identify potential neoantigens in RNA-Seq data with a high sensitivity and specificity. CONCLUSION: neoANT-HILL is a user-friendly tool with a graphical interface that performs neoantigen prediction efficiently. neoANT-HILL is able to process multiple samples, provides several binding predictors, enables quantification of tumor-infiltrating immune cells and considers RNA-Seq data for identifying potential neoantigens. The software is available through GitHub at https://github.com/neoanthill/neoANT-HILL.


Subject(s)
Antigens, Neoplasm , Databases, Genetic , Melanoma , RNA-Seq , Software , Antigens, Neoplasm/genetics , Antigens, Neoplasm/immunology , Humans , Melanoma/genetics , Melanoma/immunology
18.
Cancer Med ; 9(16): 5948-5959, 2020 08.
Article in English | MEDLINE | ID: mdl-32592321

ABSTRACT

Tumor DNA has been detected in body fluids of cancer patients. Somatic tumor mutations are being used as biomarkers in body fluids to monitor chemotherapy response as a minimally invasive tool. In this study, we evaluated the potential of tracking somatic mutations in free DNA of plasma and urine collected from Wilms tumor (WT) patients for monitoring treatment response. Wilms tumor is a pediatric renal tumor resulting from cell differentiation errors during nephrogenesis. Its mutational repertoire is not completely defined. Thus, for identifying somatic mutations from tumor tissue DNA, we screened matched tumor/leukocyte DNAs using either a panel containing 16 WT-associated genes or whole-exome sequencing (WES). The identified somatic tumor mutations were tracked in urine and plasma DNA collected before, during and after treatment. At least one somatic mutation was identified in five out of six WT tissue samples analyzed. Somatic mutations were detected in body fluids before treatment in all five patients (three patients in urine, three in plasma, and one in both body fluids). In all patients, a decrease of the variant allele fraction of somatic mutations was observed in body fluids during neoadjuvant chemotherapy. Interestingly, the persistence of somatic mutations in body fluids was in accordance with clinical parameters. For one patient who progressed to death, the mutation persisted at high levels in serial body fluid samples during treatment. For three patients without disease progression, somatic mutations were not consistently detected in samples throughout monitoring. For one patient with bilateral disease, a somatic mutation was detected at low levels with no support of clinical manifestation. Our results demonstrated the potential of tracking somatic mutations in urine and plasma DNA as a minimally invasive tool for monitoring WT patients. Additional investigation is needed to check the clinical value of persistent somatic mutations in body fluids.


Subject(s)
DNA, Neoplasm/genetics , Kidney Neoplasms/genetics , Mutation , Wilms Tumor/genetics , Alleles , Chemotherapy, Adjuvant , Child, Preschool , DNA, Neoplasm/blood , DNA, Neoplasm/urine , Female , Humans , Infant , Kidney Neoplasms/blood , Kidney Neoplasms/drug therapy , Kidney Neoplasms/urine , Lung Neoplasms/genetics , Lung Neoplasms/secondary , Neoadjuvant Therapy , Exome Sequencing , Wilms Tumor/blood , Wilms Tumor/drug therapy , Wilms Tumor/urine
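The abstract of item 18 tracks the variant allele fraction of somatic mutations across serial samples. As a small worked example of that quantity, the sketch below computes it as the share of reads at a site that carry the variant allele; the time-point labels and read counts are invented for illustration and do not come from the study.

```python
def variant_allele_fraction(ref_reads, alt_reads):
    """Fraction of reads at a site that carry the variant allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return alt_reads / total


# Hypothetical serial samples for one mutation: (label, ref reads, alt reads).
samples = [
    ("before treatment", 70, 30),
    ("during treatment", 95, 5),
    ("after treatment", 200, 0),
]
for label, ref_reads, alt_reads in samples:
    print(label, variant_allele_fraction(ref_reads, alt_reads))
```

A falling fraction across the series would correspond to the decrease during neoadjuvant chemotherapy that the abstract describes, while a fraction that stays high would correspond to persistence.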
19.
Gene ; 726: 144168, 2020 Feb 05.
Article in English | MEDLINE | ID: mdl-31759986

ABSTRACT

Methods based around statistics and linear algebra have been increasingly used in attempts to address emerging questions in microarray literature. Microarray technology is a long-used tool in the global analysis of gene expression, allowing for the simultaneous investigation of hundreds or thousands of genes in a sample. It is characterized by a low sample size and a large number of features, which creates a non-square data matrix of incomplete rank that can admit countless solutions in classifiers. To avoid the problem of the 'curse of dimensionality', many authors have performed feature selection or reduced the size of the data matrix. In this work, we introduce a new logistic regression-based model to classify breast cancer tumor samples based on microarray expression data, including all features of gene expression and without reducing the microarray data matrix. If the user still deems it necessary to perform feature reduction, it can be done after the application of the methodology, still maintaining a good classification. This methodology allowed the correct classification of breast cancer sample data sets from Gene Expression Omnibus (GEO) data series GSE65194, GSE20711, and GSE25055, which contain the microarray data of said breast cancer samples. Classification had a minimum performance of 80% (sensitivity and specificity), and explored all possible data combinations, including breast cancer subtypes. This methodology highlighted genes not yet studied in breast cancer, some of which have been observed in Gene Regulatory Networks (GRNs). In this work we examine the patterns and features of a GRN composed of transcription factors (TFs) in MCF-7 breast cancer cell lines, providing valuable information regarding breast cancer. In particular, some genes had associated αi* parameter values that were extreme, either positive or negative, and, as such, can be identified as breast cancer prediction genes. We indicate that the PKN2, MKL1, MED23, CUL5 and GLI genes demonstrate a tumor suppressor profile, and that the MTR, ITGA2B, TELO2, MRPL9, MTTL1, WIPI1, KLHL20, PI4KB, FOLR1 and SHC1 genes demonstrate an oncogenic profile. We propose that these may serve as potential breast cancer prediction genes, and should be prioritized for further clinical studies on breast cancer. This new model allows for the assignment of values to the αi* parameters associated with gene expression. It was noted that some αi* parameters are associated with genes previously described as breast cancer biomarkers, as well as other genes not yet studied in relation to this disease.


Subject(s)
Breast Neoplasms/genetics , Gene Expression Regulation, Neoplastic/genetics , Gene Regulatory Networks/genetics , Biomarkers, Tumor/genetics , Cell Line, Tumor , Disease Progression , Female , Gene Expression Profiling/methods , Humans , Logistic Models , MCF-7 Cells , Oligonucleotide Array Sequence Analysis/methods , Transcription Factors/genetics
20.
BMC Bioinformatics ; 10: 170, 2009 Jun 06.
Article in English | MEDLINE | ID: mdl-19500384

ABSTRACT

BACKGROUND: High-throughput molecular approaches for gene expression profiling, such as Serial Analysis of Gene Expression (SAGE), Massively Parallel Signature Sequencing (MPSS) or Sequencing-by-Synthesis (SBS) represent powerful techniques that provide global transcription profiles of different cell types through sequencing of short fragments of transcripts, called sequence tags. These techniques have improved our understanding of the relationships between these expression profiles and cellular phenotypes. Despite this, more reliable datasets are still necessary. In this work, we present a web-based tool named S3T: Score System for Sequence Tags, to index sequenced tags in accordance with their reliability. This is done through a series of evaluations based on a defined rule set. S3T allows the identification/selection of tags considered more reliable for further gene expression analysis. RESULTS: This methodology was applied to a public SAGE dataset. In order to compare data before and after filtering, a hierarchical clustering analysis was performed in samples from the same type of tissue, in distinct biological conditions, using these two datasets. Our results provide evidence suggesting that it is possible to find more congruous clusters after using the S3T scoring system. CONCLUSION: These results substantiate the proposed application to generate more reliable data. This is a significant contribution to the determination of global gene expression profiles. The library analysis with S3T is freely available at http://gdm.fmrp.usp.br/s3t/. S3T source code and datasets can also be downloaded from the aforementioned website.


Subject(s)
Cluster Analysis , Expressed Sequence Tags/chemistry , Gene Expression Profiling/methods , RNA/chemistry , Animals , Colonic Neoplasms/genetics , Colonic Neoplasms/metabolism , Data Interpretation, Statistical , Databases, Genetic , Expressed Sequence Tags/metabolism , Gene Expression , Humans , Mice , RNA/metabolism