1.
Bioinformatics ; 38(3): 844-845, 2022 Jan 12.
Article in English | MEDLINE | ID: mdl-34664620

ABSTRACT

MOTIVATION: Discover is an algorithm developed to identify mutually exclusive genomic events. Its main contribution is a statistical analysis based on the Poisson-Binomial (PB) distribution that takes into account the mutation rates of genes and samples. Discover is very effective at identifying mutually exclusive mutations, but at the expense of speed on large datasets: the PB is computationally costly to estimate, and checking all the potential mutually exclusive alterations requires millions of tests. RESULTS: We have implemented a new version of the package, called Rediscover, that implements exact and approximate computations of the PB. The exact implementation in Rediscover is slightly faster than Discover for large and medium-sized datasets. The approximation is 100-1000 times faster on these datasets, making it possible to get results in less than a minute on a standard desktop. The memory footprint is also smaller in Rediscover. The new package is available at CRAN and provides functions to integrate it with other R packages such as maftools and TCGAbiolinks. AVAILABILITY AND IMPLEMENTATION: Rediscover is available at CRAN (https://cran.r-project.org/web/packages/Rediscover/index.html). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genomics , Software , Algorithms , Genome , Mutation
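
The abstract of record 1 describes exact and approximate computations of the Poisson-Binomial distribution for mutual-exclusivity testing. The sketch below illustrates the general idea in Python and is not Rediscover's implementation: it assumes each sample has its own probability that both genes are mutated, computes the exact distribution of the co-occurrence count by dynamic programming, and compares it with a normal approximation with continuity correction. The function names, the example probabilities, and the choice of a normal approximation are all assumptions made for illustration; the abstract does not say which approximation Rediscover uses.

```python
from math import erf, sqrt


def poisson_binomial_pmf(probs):
    """Exact PMF of the number of successes among independent,
    non-identical Bernoulli trials (O(n^2) dynamic programming)."""
    pmf = [1.0]  # with zero trials, P(X = 0) = 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this trial fails
            nxt[k + 1] += mass * p      # this trial succeeds
        pmf = nxt
    return pmf


def exclusivity_pvalue_exact(probs, observed):
    """Left-tail probability P(X <= observed): few co-occurrences
    relative to expectation suggest mutual exclusivity."""
    return min(1.0, sum(poisson_binomial_pmf(probs)[: observed + 1]))


def exclusivity_pvalue_normal(probs, observed):
    """Normal approximation with continuity correction (illustrative
    choice, not taken from the source)."""
    mean = sum(probs)
    var = sum(p * (1.0 - p) for p in probs)
    if var == 0.0:
        return 1.0 if observed >= mean else 0.0
    z = (observed + 0.5 - mean) / sqrt(var)
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


if __name__ == "__main__":
    # Hypothetical per-sample probabilities that both genes are mutated.
    probs = [0.05, 0.10, 0.20, 0.15, 0.30, 0.08] * 20
    observed = 5  # hypothetical number of samples with both mutations
    print("exact :", exclusivity_pvalue_exact(probs, observed))
    print("normal:", exclusivity_pvalue_normal(probs, observed))
```

The exact routine costs time quadratic in the number of samples for every gene pair, which is why an approximation pays off when millions of pairs are tested.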
2.
Bioinformatics ; 38(6): 1491-1496, 2022 Mar 04.
Article in English | MEDLINE | ID: mdl-34978563

ABSTRACT

MOTIVATION: Isoform deconvolution is an NP-hard problem. The accuracy of the proposed solutions is far from perfect. At present, it is not known whether gene structure and isoform concentration can be uniquely inferred given paired-end reads, and there is no objective method to select the fragment length that maximizes the number of identifiable genes. Different pieces of evidence suggest that the optimal fragment length is gene-dependent, stressing the need for a method that selects the fragment length according to a reasonable trade-off across all the genes in the whole genome. RESULTS: A gene is considered to be identifiable if both the structure and the concentration of its transcripts can be determined uniquely. Here, we present a method to establish the identifiability of this deconvolution problem. Assuming a given transcriptome and that the coverage is sufficient to interrogate all junction reads of the transcripts, this method states whether or not a gene is identifiable given the read length and fragment length distribution. Applying this method to different read and fragment length combinations, the optimal average fragment length for the human transcriptome is around 400-600 nt for coding genes and 150-200 nt for long non-coding RNAs. The optimal read length is the largest one that fits in the fragment length. The potential benefit of combining several libraries to reconstruct the transcriptome is also discussed. Combining two libraries of very different fragment lengths results in a significant improvement in gene identifiability. AVAILABILITY AND IMPLEMENTATION: Code is available on GitHub (https://github.com/JFerrer-B/transcriptome-identifiability). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genome , Transcriptome , Humans , RNA-Seq , Gene Library , Protein Isoforms/genetics , Software
3.
BMC Genomics ; 20(1): 521, 2019 Jun 25.
Article in English | MEDLINE | ID: mdl-31238884

ABSTRACT

BACKGROUND: Splicing is a genetic process that has important implications in several diseases including cancer. Deciphering the complex rules of splicing regulation is crucial to understand and treat splicing-related diseases. Splicing factors and other RNA-binding proteins (RBPs) play a key role in the regulation of splicing. The specific binding sites of an RBP can be measured using CLIP experiments. However, to unveil which RBPs regulate a condition, it is necessary to have a priori hypotheses, as a single CLIP experiment targets a single protein. RESULTS: In this work, we present a novel methodology to predict context-specific splicing factors from transcriptomic data. For this, we systematically collect, integrate and analyze more than 900 CLIP experiments stored in four CLIP databases: POSTAR2, CLIPdb, DoRiNA and StarBase. The analysis of these experiments shows strong coherence between the binding sites of RBPs of similar families. Augmenting this information with expression changes, we are able to correctly predict the splicing factors that regulate splicing in two gold-standard experiments in which specific splicing factors are knocked down. CONCLUSIONS: The methodology presented in this study allows the prediction of active splicing factors in cancer or any other condition using only transcript expression data. This approach opens a wide range of possible studies to understand the splicing regulation of different conditions. A tutorial with the source code and databases is available at https://gitlab.com/fcarazo.m/sfprediction .


Subject(s)
Computational Biology/methods , Gene Expression Profiling , Immunoprecipitation , RNA Splicing Factors/metabolism , RNA-Binding Proteins/metabolism , Animals , Genetic Databases , High-Throughput Nucleotide Sequencing , Humans , Mice , RNA Splicing Factors/chemistry
4.
Arch Bronconeumol ; 2024 Jun 04.
Article in English, Spanish | MEDLINE | ID: mdl-38971669

ABSTRACT

INTRODUCTION: Trisegmentectomy, or resection of the upper subdivision of the left upper lobe with preservation of the lingula, is considered by some authors to be equivalent to right upper lobectomy with middle lobe preservation. Our objective was to compare survival and recurrence after trisegmentectomy versus left upper lobectomy procedures registered in the Spanish Video-Assisted Thoracic Surgery group (GEVATS) database. METHODS: We compared mortality, survival and recurrence in patients with left upper lobectomy or trisegmentectomy after propensity score matching for the following variables: age, smoking habit, tumor size, histologic type, radiological density of tumor, surgical access, forced expiratory volume in one second, diffusing capacity of the lungs for carbon monoxide, hypertension, chronic heart failure, ischemic heart disease, arrhythmia, stroke, peripheral vascular disease, diabetes and pre-surgery nodal status by positron emission tomography/computed tomography. RESULTS: A total of 540 left upper lobectomies and 83 trisegmentectomies were registered in the GEVATS database. After propensity score matching, 134 left upper lobectomies and 67 trisegmentectomies were selected. Survival outcomes were similar, but differences were found for recurrence (21.5% for trisegmentectomies vs. 35.4% for left upper lobectomies, p=0.05). Moreover, the recurrence patterns differed, with the lobectomy group showing a greater tendency to distant dissemination. CONCLUSIONS: Trisegmentectomy and left upper lobectomy show similar 5-year survival rates. In our database, recurrence after trisegmentectomy was lower than after left upper lobectomy, and the recurrence pattern differed between the two surgical approaches, with a greater tendency to distant metastasis after left upper lobectomy.
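
The propensity score matching step in the abstract above can be illustrated with a small Python sketch. The abstract does not state which matching algorithm was used. The matched group sizes (134 vs. 67) are consistent with 2:1 matching, so the sketch assumes greedy nearest-neighbour matching without replacement, two lobectomy patients per trisegmentectomy patient, with a caliper. The propensity scores are taken as already estimated (for example, by a logistic model on the listed covariates). The function name, patient identifiers, scores, and caliper value are all hypothetical.

```python
def match_nearest(treated, controls, ratio=2, caliper=0.1):
    """Greedy nearest-neighbour matching without replacement.

    treated, controls: dicts mapping patient id -> propensity score.
    Returns a dict mapping each matched treated id to a list of
    `ratio` control ids whose scores lie within `caliper` of it.
    Treated patients without enough close controls are left unmatched.
    """
    available = dict(controls)  # controls not yet used
    matches = {}
    # Process treated patients in order of increasing score.
    for tid, tscore in sorted(treated.items(), key=lambda kv: kv[1]):
        nearest = sorted(available, key=lambda cid: abs(available[cid] - tscore))
        chosen = [cid for cid in nearest[:ratio]
                  if abs(available[cid] - tscore) <= caliper]
        if len(chosen) == ratio:
            matches[tid] = chosen
            for cid in chosen:
                del available[cid]  # without replacement
    return matches


if __name__ == "__main__":
    # Hypothetical propensity scores (probability of trisegmentectomy).
    trisegmentectomy = {"T1": 0.30, "T2": 0.55, "T3": 0.90}
    lobectomy = {"C1": 0.28, "C2": 0.33, "C3": 0.50,
                 "C4": 0.58, "C5": 0.61, "C6": 0.10}
    print(match_nearest(trisegmentectomy, lobectomy))
```

In this toy example T3 stays unmatched because no remaining lobectomy patient has a score within the caliper, which mirrors how matching discards patients who have no comparable counterpart.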

5.
NAR Genom Bioinform ; 4(3): lqac067, 2022 Sep.
Article in English | MEDLINE | ID: mdl-36128425

ABSTRACT

Alternative splicing (AS) plays a key role in cancer: all its hallmarks have been associated with different mechanisms of abnormal AS. The improvement of the human transcriptome annotation and the availability of fast and accurate software to estimate isoform concentrations have boosted the analysis of transcriptome profiling from RNA-seq. The statistical analysis of AS is a challenging problem not yet fully solved. We have included in EventPointer (EP), a Bioconductor package, a novel statistical method that can use the bootstrap of the pseudoaligners. We compared it with other state-of-the-art algorithms to analyze AS. Its performance is outstanding for shallow sequencing conditions. The statistical framework is very flexible since it is based on design and contrast matrices. EP now includes a convenient tool to find the primers to validate the discoveries using PCR. We also added a statistical module to study alterations in protein domains related to AS. Applying it to 9514 patients from TCGA and TARGET in 19 different tumor types resulted in two conclusions: (i) aberrant alternative splicing alters the relative presence of protein domains and (ii) the number of enriched domains is strongly correlated with the age of the patients.

6.
Sci Rep ; 10(1): 1069, 2020 Jan 23.
Article in English | MEDLINE | ID: mdl-31974522

ABSTRACT

The advent of RNA-seq technologies has switched the paradigm of genetic analysis from a genome to a transcriptome-based perspective. Alternative splicing generates functional diversity in genes, but the precise functions of many individual isoforms are yet to be elucidated. Gene Ontology was developed to annotate gene products according to their biological processes, molecular functions and cellular components. Although a single gene may have several gene products, most annotations are not isoform-specific and do not distinguish the functions of the different proteins originating from a single gene. Several approaches have tried to automatically annotate ontologies at the isoform level, but this has proven to be a daunting task. We have developed ISOGO (ISOform + GO function imputation), a novel algorithm to predict the function of coding isoforms based on their protein domains and their correlation of expression across 11,373 cancer patients. Combining these two sources of information outperforms previous approaches: it provides an area under precision-recall curve (AUPRC) five times larger than previous attempts and the median AUROC of assigned functions to genes is 0.82. We tested ISOGO predictions on some genes with isoform-specific functions (BRCA1, MADD, VAMP7 and ITSN1) and they were coherent with the literature. In addition, we examined whether the main isoform of each gene (as predicted by APPRIS) was the most likely to have the annotated gene functions, which was the case in 99.4% of the genes. We also evaluated the predictions for isoform-specific functions provided by the CAFA3 challenge and the results were also convincing. To make these results available to the scientific community, we have deployed a web application to consult ISOGO predictions (https://biotecnun.unav.es/app/isogo). Initial data, website link, isoform-specific GO function predictions and R code are available at https://gitlab.com/icassol/isogo.


Subject(s)
Algorithms , Molecular Sequence Annotation/methods , Protein Isoforms/genetics , Alternative Splicing , Gene Ontology , Humans , Open Reading Frames