Search | VHL Regional Portal

Single-cell transcriptional landscape of long non-coding RNAs orchestrating mouse heart development.

Ramos, Thaís A R; Urquiza-Zurich, Sebastián; Kim, Soo Young; Gillette, Thomas G; Hill, Joseph A; Lavandero, Sergio; do Rêgo, Thaís G; Maracaja-Coutinho, Vinicius.

Cell Death Dis ; 14(12): 841, 2023 Dec 18.

Article in English | MEDLINE | ID: mdl-38110334

ABSTRACT

Long non-coding RNAs (lncRNAs) comprise the most representative transcriptional units of the mammalian genome. They are associated with organ development linked with the emergence of cardiovascular diseases. We used bioinformatic approaches, machine learning algorithms, systems biology analyses, and statistical techniques to define co-expression modules linked to heart development and cardiovascular diseases. We also uncovered differentially expressed transcripts in subpopulations of cardiomyocytes. Finally, from this work, we were able to identify eight cardiac cell-types; several new coding, lncRNA, and pcRNA markers; two cardiomyocyte subpopulations at four different time points (ventricle E9.5, left ventricle E11.5, right ventricle E14.5 and left atrium P0) that harbored co-expressed gene modules enriched in mitochondrial, heart development and cardiovascular diseases. Our results evidence the role of particular lncRNAs in heart development and highlight the usage of co-expression modular approaches in the cell-type functional definition.

Subject(s)

Cardiovascular Diseases; RNA, Long Noncoding; Animals; Mice; RNA, Long Noncoding/genetics; Gene Expression Profiling/methods; Organogenesis; Myocytes, Cardiac; Mammals/genetics

Novel Perlin-based Phantoms Using 3D Models of Compressed Breast Shape and Fractal Noise.

Teixeira, João P V; Silva Filho, Telmo M; do Rêgo, Thaís G; Malheiros, Yuri B; Dustler, Magnus; Bakic, Predrag R; Vent, Trevor L; Acciavatti, Raymond J; Krishnamoorthy, Srilalan; Surti, Suleman; Maidment, Andrew D A; Barufaldi, Bruno.

Proc SPIE Int Soc Opt Eng ; 12031, 2022.

Article in English | MEDLINE | ID: mdl-39351016

ABSTRACT

Virtual clinical trials (VCTs) have been used widely to evaluate digital breast tomosynthesis (DBT) systems. VCTs require realistic simulations of the breast anatomy (phantoms) to characterize lesions and to estimate risk of masking cancers. This study introduces the use of Perlin-based phantoms to optimize the acquisition geometry of a novel DBT prototype. These phantoms were developed using a GPU implementation of a novel library called Perlin-CuPy. The breast anatomy is simulated using 3D models under mammography cranio-caudal compression. In total, 240 phantoms were created using compressed breast thickness, chest-wall to nipple distance, and skin thickness that varied in a {[35, 75], [59, 130), [1.0, 2.0]} mm interval, respectively. DBT projections and reconstructions of the phantoms were simulated using two acquisition geometries of our DBT prototype. The performance of both acquisition geometries was compared using breast volume segmentations of the Perlin phantoms. Results show that breast volume estimates are improved with the introduction of posterior-anterior motion of the x-ray source in DBT acquisitions. The breast volume is overestimated in DBT, varying substantially with the acquisition geometry; segmentation errors are more evident for thicker and larger breasts. These results provide additional evidence and suggest that custom acquisition geometries can improve the performance and accuracy in DBT. Perlin phantoms help to identify limitations in acquisition geometries and to optimize the performance of the DBT prototypes.

Multiclass Segmentation of Suspicious Findings in Simulated Breast Tomosynthesis Images Using a U-Net.

da Nobrega, Yann N G; Carvalhal, Giulia; Teixeira, João P V; de Camargo, Barbara P; do Rego, Thais G; Malheiros, Yuri; Silva Filho, Telmo de M E; Vent, Trevor L; Acciavatti, Raymond J; Maidment, Andrew D A; Barufaldi, Bruno.

Proc SPIE Int Soc Opt Eng ; 12286, 2022 May.

Article in English | MEDLINE | ID: mdl-39183730

ABSTRACT

Our lab has built a next-generation tomosynthesis (NGT) system utilizing scanning motions with more degrees of freedom than clinical digital breast tomosynthesis systems. We are working toward designing scanning motions that are customized around the locations of suspicious findings. The first step in this direction is to demonstrate that these findings can be detected with a single projection image, which can guide the remainder of the scan. This paper develops an automated method to identify findings that are prone to be masked. Perlin-noise phantoms and synthetic lesions were used to simulate masked cancers. NGT projections of phantoms were simulated using ray-tracing software. The risk of masking cancers was mapped using the ground-truth labels of phantoms. The phantom labels were used to denote regions of low and high risk of masking suspicious findings. A U-Net model was trained for multiclass segmentation of phantom images. Model performance was quantified with a receiver operating characteristic (ROC) curve using area under the curve (AUC). The ROC operating point was defined to be the point closest to the upper left corner of ROC space. The output predictions showed an accurate segmentation of tissue predominantly adipose (mean AUC of 0.93). The predictions also indicate regions of suspicious findings; for the highest risk class, mean AUC was 0.89, with a true positive rate of 0.80 and a true negative rate of 0.83 at the operating point. In summary, this paper demonstrates with virtual phantoms that a single projection can indeed be used to identify suspicious findings.
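The operating-point rule stated in the abstract above (the ROC point closest to the upper left corner) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names `roc_points` and `closest_to_corner` and the toy scores and labels are assumptions introduced here, and the real study worked on per-class segmentation outputs rather than eight hand-written values.

```python
import math


def roc_points(scores, labels):
    """Return (threshold, fpr, tpr) for every distinct score used as a threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        # A sample is predicted positive when its score reaches the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((threshold, fp / negatives, tp / positives))
    return points


def closest_to_corner(points):
    """Pick the point with the smallest Euclidean distance to (FPR=0, TPR=1)."""
    return min(points, key=lambda p: math.hypot(p[1], 1.0 - p[2]))


# Toy data (hypothetical): 4 positive and 4 negative samples.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

threshold, fpr, tpr = closest_to_corner(roc_points(scores, labels))
true_negative_rate = 1.0 - fpr
print(threshold, tpr, true_negative_rate)
```

The true negative rate reported at the operating point is simply one minus the false positive rate at the selected threshold.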

RNAmining: A machine learning stand-alone and web server tool for RNA coding potential prediction.

Ramos, Thaís A R; Galindo, Nilbson R O; Arias-Carrasco, Raúl; da Silva, Cecília F; Maracaja-Coutinho, Vinicius; do Rêgo, Thaís G.

F1000Res ; 10: 323, 2021.

Article in English | MEDLINE | ID: mdl-34164114

ABSTRACT

Non-coding RNAs (ncRNAs) are important players in the cellular regulation of organisms from different kingdoms. One of the key steps in ncRNAs research is the ability to distinguish coding/non-coding sequences. We applied seven machine learning algorithms (Naive Bayes, SVM, KNN, Random Forest, XGBoost, ANN and DL) through 15 model organisms from different evolutionary branches. Then, we created a stand-alone and web server tool (RNAmining) to distinguish coding and non-coding sequences, selecting the algorithm with the best performance (XGBoost). Firstly, we used coding/non-coding sequences downloaded from Ensembl (April 14th, 2020). Then, coding/non-coding sequences were balanced, had their tri-nucleotides counts analysed and we performed a normalization by the sequence length. Thus, in total we built 180 models. All the machine learning algorithms tests were performed using 10-folds cross-validation and we selected the algorithm with the best results (XGBoost) to implement at RNAmining. Best F1-scores ranged from 97.56% to 99.57% depending on the organism. Moreover, we produced a benchmarking with other tools already in literature (CPAT, CPC2, RNAcon and Transdecoder) and our results outperformed them, opening opportunities for the development of RNAmining, which is freely available at https://rnamining.integrativebioinformatics.me/.

Subject(s)

Machine Learning; RNA; Algorithms; Bayes Theorem; Support Vector Machine
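The feature step described in the RNAmining abstract (tri-nucleotide counts normalized by sequence length) can be sketched as below. This is an illustration under stated assumptions, not the tool's implementation: the function name `trinucleotide_features` is hypothetical, and the use of an overlapping sliding window, the conversion of U to T, and the skipping of windows with ambiguous bases are choices made here that the abstract does not specify.

```python
from itertools import product

# All 64 possible tri-nucleotides in a fixed order, so feature vectors align.
TRINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=3)]


def trinucleotide_features(sequence):
    """Count overlapping tri-nucleotides and divide each count by sequence length."""
    seq = sequence.upper().replace("U", "T")  # assumption: treat RNA as DNA letters
    counts = dict.fromkeys(TRINUCLEOTIDES, 0)
    for i in range(len(seq) - 2):
        kmer = seq[i:i + 3]
        if kmer in counts:  # assumption: windows with N or other symbols are skipped
            counts[kmer] += 1
    length = len(seq)
    return [counts[k] / length if length else 0.0 for k in TRINUCLEOTIDES]


features = trinucleotide_features("ATGATG")
print(len(features), features[TRINUCLEOTIDES.index("ATG")])
```

A vector of this kind, computed for each balanced coding and non-coding sequence, is the sort of input a classifier such as XGBoost could then be trained on.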

CORAZON: a web server for data normalization and unsupervised clustering based on expression profiles.

Ramos, Thaís A R; Maracaja-Coutinho, Vinicius; Ortega, J Miguel; do Rêgo, Thaís G.

BMC Res Notes ; 13(1): 338, 2020 Jul 14.

Article in English | MEDLINE | ID: mdl-32665017

ABSTRACT

OBJECTIVE: Data normalization and clustering are mandatory steps in gene expression and downstream analyses, respectively. However, user-friendly implementations of these methodologies are available exclusively under expensive licensing agreements, or in stand-alone scripts developed, reflecting on a great obstacle for users with less computational skills. RESULTS: We developed an online tool called CORAZON (Correlations Analyses Zipper Online), which implements three unsupervised learning methods to cluster gene expression datasets in a friendly environment. It allows the usage of eight gene expression normalization/transformation methodologies and the attribute's influence. The normalizations requiring the gene length only could be performed to RNA-seq, meanwhile the others can be used with microarray and/or NanoString data. Clustering methodologies performances were evaluated through five models with accuracies between 92 and 100%. We applied our tool to obtain functional insights of non-coding RNAs (ncRNAs) based on Gene Ontology enrichment of clusters in a dataset generated by the ENCODE project. The clusters where the majority of transcripts are coding genes were enriched in Cellular, Metabolic, Transports, and Systems Development categories. Meanwhile, the ncRNAs were enriched in the Detection of Stimulus, Sensory Perception, Immunological System, and Digestion categories. CORAZON source-code is freely available at https://gitlab.com/integrativebioinformatics/corazon and the web-server can be accessed at http://corazon.integrativebioinformatics.me .

Subject(s)

Computers; Software; Cluster Analysis; Gene Expression Profiling; Gene Ontology; Internet; RNA, Untranslated

A Novel Method to Predict Genomic Islands Based on Mean Shift Clustering Algorithm.

de Brito, Daniel M; Maracaja-Coutinho, Vinicius; de Farias, Savio T; Batista, Leonardo V; do Rêgo, Thaís G.

PLoS One ; 11(1): e0146352, 2016.

Article in English | MEDLINE | ID: mdl-26731657

ABSTRACT

Genomic Islands (GIs) are regions of bacterial genomes that are acquired from other organisms by the phenomenon of horizontal transfer. These regions are often responsible for many important acquired adaptations of the bacteria, with great impact on their evolution and behavior. Nevertheless, these adaptations are usually associated with pathogenicity, antibiotic resistance, degradation and metabolism. Identification of such regions is of medical and industrial interest. For this reason, different approaches for genomic islands prediction have been proposed. However, none of them are capable of predicting precisely the complete repertory of GIs in a genome. The difficulties arise due to the changes in performance of different algorithms in the face of the variety of nucleotide distribution in different species. In this paper, we present a novel method to predict GIs that is built upon mean shift clustering algorithm. It does not require any information regarding the number of clusters, and the bandwidth parameter is automatically calculated based on a heuristic approach. The method was implemented in a new user-friendly tool named MSGIP--Mean Shift Genomic Island Predictor. Genomes of bacteria with GIs discussed in other papers were used to evaluate the proposed method. The application of this tool revealed the same GIs predicted by other methods and also different novel unpredicted islands. A detailed investigation of the different features related to typical GI elements inserted in these new regions confirmed its effectiveness. Stand-alone and user-friendly versions for this new methodology are available at http://msgip.integrativebioinformatics.me.

Subject(s)

Genome, Bacterial; Genomic Islands; Genomics/methods; Algorithms; Cluster Analysis
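The property highlighted in the abstract above, that mean shift needs no preset number of clusters, can be illustrated with a small one-dimensional sketch. It is not MSGIP: the function name `mean_shift_1d`, the flat kernel, the mode-merging rule, and the toy values are all assumptions made here, and the bandwidth is passed in by hand because the abstract does not describe the heuristic MSGIP uses to compute it.

```python
def mean_shift_1d(values, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift on 1D data; returns (cluster centers, labels)."""
    modes = []
    for start in values:
        x = float(start)
        for _ in range(max_iter):
            # Shift x to the mean of all points inside the current window.
            neighbours = [v for v in values if abs(v - x) <= bandwidth]
            new_x = sum(neighbours) / len(neighbours)
            if abs(new_x - x) < tol:
                break
            x = new_x
        modes.append(x)

    # Points whose modes fall within half a bandwidth share a cluster.
    centers, labels = [], []
    for mode in modes:
        for index, center in enumerate(centers):
            if abs(mode - center) <= bandwidth / 2:
                labels.append(index)
                break
        else:
            centers.append(mode)
            labels.append(len(centers) - 1)
    return centers, labels


# Toy values (hypothetical), e.g. a per-window sequence statistic.
values = [0.30, 0.31, 0.32, 0.50, 0.51, 0.52]
centers, labels = mean_shift_1d(values, bandwidth=0.05)
print(centers, labels)
```

The number of clusters falls out of the data and the bandwidth alone, which is why the choice of bandwidth matters so much in practice.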

Evolution of transfer RNA and the origin of the translation system.

de Farias, Savio T; do Rêgo, Thaís G; José, Marco V.

Front Genet ; 5: 303, 2014.

Article in English | MEDLINE | ID: mdl-25221573

Inferring epigenetic and transcriptional regulation during blood cell development with a mixture of sparse linear models.

do Rego, Thais G; Roider, Helge G; de Carvalho, Francisco A T; Costa, Ivan G.

Bioinformatics ; 28(18): 2297-303, 2012 Sep 15.

Article in English | MEDLINE | ID: mdl-22730432

ABSTRACT

MOTIVATION: Blood cell development is thought to be controlled by a circuit of transcription factors (TFs) and chromatin modifications that determine the cell fate through activating cell type-specific expression programs. To shed light on the interplay between histone marks and TFs during blood cell development, we model gene expression from regulatory signals by means of combinations of sparse linear regression models. RESULTS: The mixture of sparse linear regression models was able to improve the gene expression prediction in relation to the use of a single linear model. Moreover, it performed an efficient selection of regulatory signals even when analyzing all TFs with known motifs (>600). The method identified interesting roles for histone modifications and a selection of TFs related to blood development and chromatin remodelling. AVAILABILITY: The method and datasets are available from http://www.cin.ufpe.br/~igcf/SparseMix. CONTACT: igcf@cin.ufpe.br SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Blood Cells/metabolism; Epigenesis, Genetic; Transcription, Genetic; Animals; Bayes Theorem; Binding Sites; Cell Differentiation/genetics; Embryonic Stem Cells/metabolism; Histones/metabolism; Linear Models; Mice; Promoter Regions, Genetic; Transcription Factors/metabolism

Predicting gene expression in T cell differentiation from histone modifications and transcription factor binding affinities by linear mixture models.

Costa, Ivan G; Roider, Helge G; do Rego, Thais G; de Carvalho, Francisco de A T.

BMC Bioinformatics ; 12 Suppl 1: S29, 2011 Feb 15.

Article in English | MEDLINE | ID: mdl-21342559

ABSTRACT

BACKGROUND: The differentiation process from stem cells to fully differentiated cell types is controlled by the interplay of chromatin modifications and transcription factor activity. Histone modifications or transcription factors frequently act in a multi-functional manner, with a given DNA motif or histone modification conveying both transcriptional repression and activation depending on its location in the promoter and other regulatory signals surrounding it. RESULTS: To account for the possible multi functionality of regulatory signals, we model the observed gene expression patterns by a mixture of linear regression models. We apply the approach to identify the underlying histone modifications and transcription factors guiding gene expression of differentiated CD4+ T cells. The method improves the gene expression prediction in relation to the use of a single linear model, as often used by previous approaches. Moreover, it recovered the known role of the modifications H3K4me3 and H3K27me3 in activating cell specific genes and of some transcription factors related to CD4+ T differentiation.

Subject(s)

CD4-Positive T-Lymphocytes/cytology; Cell Differentiation; Histones/metabolism; Transcription Factors/metabolism; Bayes Theorem; CD4-Positive T-Lymphocytes/metabolism; DNA/genetics; DNA/metabolism; Gene Expression Regulation; Histones/genetics; Linear Models; Protein Binding; Transcription Factors/genetics
