1.
Genomics ; 113(3): 1308-1324, 2021 May.
Article in English | MEDLINE | ID: mdl-33662531

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) is a powerful technology capable of generating gene expression data at the resolution of individual cells. scRNA-seq data are characterized by dropout events, which severely bias the results if left unaddressed. Few Differential Expression (DE) approaches model the biological processes that lead to dropout events. We therefore developed SwarnSeq, an improved method for DE and other downstream analyses that considers the molecular capture process in scRNA-seq data modeling. The performance of the proposed method was benchmarked against 11 existing methods on 10 real scRNA-seq datasets under three comparison settings. SwarnSeq showed improved performance over the 11 existing methods, and this improvement was consistent across several public scRNA-seq datasets generated using different scRNA-seq protocols. External spike-in data can be used in SwarnSeq to enhance its performance. AVAILABILITY AND IMPLEMENTATION: The method is implemented as a publicly available R package at https://github.com/sam-uofl/SwarnSeq.


Subject(s)
Gene Expression Profiling , Single-Cell Analysis , RNA-Seq , Sequence Analysis, RNA/methods , Software
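
The dropout events mentioned in this abstract are often described with a zero-inflated count model. The following is a minimal sketch of a zero-inflated negative binomial (ZINB) probability mass function, not the SwarnSeq implementation: the function names, the parameterisation, and the idea of scaling the mean by a per-cell `capture_rate` are assumptions made here for illustration only.

```python
import math


def nb_pmf(k, mean, dispersion):
    """Negative binomial pmf parameterised by its mean (> 0) and size (dispersion)."""
    r = dispersion
    p = r / (r + mean)  # success probability implied by the mean and size
    log_pmf = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(p) + k * math.log(1.0 - p)
    )
    return math.exp(log_pmf)


def zinb_pmf(k, mean, dispersion, zero_prob):
    """Mixture of a point mass at zero (weight zero_prob) and a negative binomial."""
    base = (1.0 - zero_prob) * nb_pmf(k, mean, dispersion)
    return zero_prob + base if k == 0 else base


def expected_zero_fraction(mean, dispersion, zero_prob, capture_rate=1.0):
    """Probability of observing a zero count when only a fraction of molecules
    is captured (hypothetical: the capture rate simply scales the mean)."""
    return zinb_pmf(0, capture_rate * mean, dispersion, zero_prob)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from any dataset.
    print(expected_zero_fraction(2.0, 1.5, 0.0, capture_rate=1.0))
    print(expected_zero_fraction(2.0, 1.5, 0.0, capture_rate=0.1))
```

Under this sketch a lower capture rate raises the share of zeros even without any extra zero inflation, which is one way to see why ignoring the capture process can bias downstream comparisons.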
2.
Entropy (Basel) ; 24(7)2022 Jul 18.
Article in English | MEDLINE | ID: mdl-35885218

ABSTRACT

With the advent of single-cell RNA-sequencing (scRNA-seq), it is possible to measure the expression dynamics of genes at the single-cell level. Through scRNA-seq, a huge amount of expression data for several thousand(s) of genes over million(s) of cells are generated in a single experiment. Differential expression analysis is the primary downstream analysis of such data to identify gene markers for cell type detection and also provide inputs to other secondary analyses. Many statistical approaches for differential expression analysis have been reported in the literature. Therefore, we critically discuss the underlying statistical principles of the approaches and distinctly divide them into six major classes, i.e., generalized linear, generalized additive, Hurdle, mixture models, two-class parametric, and non-parametric approaches. We also succinctly discuss the limitations that are specific to each class of approaches, and how they are addressed by other subsequent classes of approach. A number of challenges are identified in this study that must be addressed to develop the next class of innovative approaches. Furthermore, we also emphasize the methodological challenges involved in differential expression analysis of scRNA-seq data that researchers must address to draw maximum benefit from this recent single-cell technology. This study will serve as a guide to genome researchers and experimental biologists to objectively select options for their analysis.

3.
Entropy (Basel) ; 23(8)2021 Jul 23.
Article in English | MEDLINE | ID: mdl-34441085

ABSTRACT

Genome-wide expression study is a powerful genomic technology to quantify expression dynamics of genes in a genome. In gene expression study, gene set analysis has become the first choice to gain insights into the underlying biology of diseases or stresses in plants. It also reduces the complexity of statistical analysis and enhances the explanatory power of the obtained results from the primary downstream differential expression analysis. The gene set analysis approaches are well developed in microarrays and RNA-seq gene expression data analysis. These approaches mainly focus on analyzing the gene sets with gene ontology or pathway annotation data. However, in plant biology, such methods may not establish any formal relationship between the genotypes and the phenotypes, as most of the traits are quantitative and controlled by polygenes. The existing Quantitative Trait Loci (QTL)-based gene set analysis approaches only focus on the over-representation analysis of the selected genes while ignoring their associated gene scores. Therefore, we developed an innovative statistical approach, GSQSeq, to analyze the gene sets with trait enriched QTL data. This approach considers the associated differential expression scores of genes while analyzing the gene sets. The performance of the developed method was tested on five different crop gene expression datasets obtained from real crop gene expression studies. Our analytical results indicated that the trait-specific analysis of gene sets was more robust and successful through the proposed approach than existing techniques. Further, the developed method provides a valuable platform for integrating the gene expression data with QTL data.
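
The over-representation analysis that this abstract attributes to existing QTL-based approaches is commonly carried out with a hypergeometric test. The sketch below shows that generic test only; it is not the GSQSeq statistic, which according to the abstract also uses gene scores. The function name and argument names are hypothetical.

```python
from math import comb


def overrepresentation_pvalue(total_genes, genes_in_qtl, selected_genes, selected_in_qtl):
    """Upper-tail hypergeometric probability P(X >= selected_in_qtl).

    X is the number of selected genes that fall inside QTL regions when
    `selected_genes` genes are drawn without replacement from `total_genes`,
    of which `genes_in_qtl` lie in QTL regions.
    """
    upper = min(genes_in_qtl, selected_genes)
    tail = sum(
        comb(genes_in_qtl, k) * comb(total_genes - genes_in_qtl, selected_genes - k)
        for k in range(selected_in_qtl, upper + 1)
    )
    return tail / comb(total_genes, selected_genes)


if __name__ == "__main__":
    # Made-up counts for illustration.
    print(overrepresentation_pvalue(1000, 100, 50, 12))
```

Because the test sees only counts, a gene that barely passes the selection threshold contributes as much as a strongly differentially expressed one, which is the limitation the abstract points to.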

4.
Entropy (Basel) ; 22(11)2020 Oct 25.
Article in English | MEDLINE | ID: mdl-33286973

ABSTRACT

Selection of biologically relevant genes from high-dimensional expression data is a key research problem in gene expression genomics. Most of the available gene selection methods are based on either a relevancy or a redundancy measure, and are usually judged through post-selection classification accuracy. In these methods the ranking of genes is conducted on a single high-dimensional expression dataset, which leads to the selection of spuriously associated and redundant genes. Hence, we developed a statistical approach that combines a support vector machine with Maximum Relevance and Minimum Redundancy under a sound statistical setup for the selection of biologically relevant genes. Here, the genes were selected through statistical significance values computed using a nonparametric test statistic under a bootstrap-based subject sampling model. Further, a systematic and rigorous evaluation of the proposed approach against nine existing competitive methods was carried out on six different real crop gene expression datasets. This performance analysis was carried out under three comparison settings, i.e., subject classification and biologically relevant criteria based on quantitative trait loci and gene ontology. Our analytical results showed that the proposed approach selects genes which are more biologically relevant than those selected by the existing methods. Moreover, the proposed approach was also found to be better with respect to the competitive existing methods. The proposed statistical approach provides a framework for combining filter and wrapper methods of gene selection.

5.
Entropy (Basel) ; 22(4)2020 Apr 10.
Article in English | MEDLINE | ID: mdl-33286201

ABSTRACT

Over the last decade, gene set analysis has become the first choice for gaining insights into underlying complex biology of diseases through gene expression and gene association studies. It also reduces the complexity of statistical analysis and enhances the explanatory power of the obtained results. Although gene set analysis approaches are extensively used in gene expression and genome wide association data analysis, the statistical structure and steps common to these approaches have not yet been comprehensively discussed, which limits their utility. In this article, we provide a comprehensive overview, statistical structure and steps of gene set analysis approaches used for microarrays, RNA-sequencing and genome wide association data analysis. Further, we also classify the gene set analysis approaches and tools by the type of genomic study, null hypothesis, sampling model and nature of the test statistic, etc. Rather than reviewing the gene set analysis approaches individually, we provide the generation-wise evolution of such approaches for microarrays, RNA-sequencing and genome wide association studies and discuss their relative merits and limitations. Here, we identify the key biological and statistical challenges in current gene set analysis, which will be addressed by statisticians and biologists collectively in order to develop the next generation of gene set analysis approaches. Further, this study will serve as a catalog and provide guidelines to genome researchers and experimental biologists for choosing the proper gene set analysis approach based on several factors.

6.
Chemosphere ; 337: 139128, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37315855

ABSTRACT

This study examined the long-term effects of triflumezopyrim in an Indian major carp, Labeo rohita. Fish were exposed to sub-lethal concentrations of the insecticide triflumezopyrim, 1.41 ppm (Treatment 1), 3.27 ppm (Treatment 2) and 4.97 ppm (Treatment 3), for 21 days. The liver, kidney, gills, muscle, and brain tissues of the fish were examined for physiological parameters and biochemical parameters such as catalase (CAT), superoxide dismutase (SOD), lactate dehydrogenase (LDH), malate dehydrogenase (MDH), alanine aminotransaminase (ALT), aspartate aminotransaminase (AST), acetylcholinesterase (AChE), and hexokinase. After 21 days of exposure, the activity of CAT, SOD, LDH, MDH and ALT increased and total protein decreased in all treatment groups compared with the control group. Long-term triflumezopyrim exposure increased ROS production, ultimately leading to oxidative cell damage and inhibiting the antioxidant capabilities of the fish tissues. Histopathological analysis showed alterations in the structure of different tissues of pesticide-treated fish. Fish exposed to the highest sublethal concentration of the pesticide showed a higher rate of damage. The present study demonstrated that chronic exposure of fish to different sublethal concentrations of triflumezopyrim has a detrimental effect on the organism.


Subject(s)
Cyprinidae , Insecticides , Water Pollutants, Chemical , Animals , Insecticides/pharmacology , Cyprinidae/metabolism , Antioxidants/metabolism , Superoxide Dismutase/metabolism , Fresh Water , Liver/metabolism , Gills/metabolism , Water Pollutants, Chemical/metabolism
7.
Sci Rep ; 13(1): 22583, 2023 Dec 19.
Article in English | MEDLINE | ID: mdl-38114542

ABSTRACT

Foot-and-mouth disease (FMD) is a severe contagious viral disease of cloven-hoofed animals. In India, a vaccination-based official FMD control programme was started and progressively expanded to cover the entire country in 2019. Serological tests are used to determine non-structural protein based sero-prevalence rates for properly implementing and assessing the control programme. Since 2008, reporting of FMD sero-surveillance has been limited to serum sample-based serological test results, without population-level estimation, owing to the lack of a proper statistical methodology. Thus, we present a computational approach for estimating the sero-prevalence rates at the state and national levels. Based on the reported approach, a web-application ( https://nifmd-bbf.icar.gov.in/FMDSeroSurv ) and an R software package ( https://github.com/sam-dfmd/FMDSeroSurv ) have been developed. The presented computational techniques are applied to the FMD sero-surveillance data for 2008-2021 to assess the status of virus circulation in India under a strict vaccination policy. Furthermore, through various structural equation models, we attempt to establish a link between India's estimated sero-prevalence rate and field FMD outbreaks. Our results indicate that the current sero-prevalence rates are significantly associated with field outbreaks from up to 2 years earlier. In addition, we observe downward trends in sero-prevalence and outbreaks over the years, specifically after 2013, which indicate the effectiveness of various measures implemented under the FMD control programme. The findings of the study may help researchers and policymakers to track virus infection and identify potential disease-free zones achieved through vaccination.


Subject(s)
Cattle Diseases , Foot-and-Mouth Disease Virus , Foot-and-Mouth Disease , Cattle , Animals , Prevalence , Antibodies, Viral , Cattle Diseases/epidemiology , Cattle Diseases/prevention & control , Foot-and-Mouth Disease/epidemiology , Foot-and-Mouth Disease/prevention & control , Disease Outbreaks/veterinary , India/epidemiology
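
The difference between sample-level test results and a population-level estimate, which this abstract raises, can be illustrated with a generic stratified estimator. This is a minimal sketch under stated assumptions, not the method implemented in FMDSeroSurv: the state labels, sample counts, and population sizes are invented, and each state's sample proportion is simply weighted by its susceptible population.

```python
def pooled_sample_prevalence(states):
    """Share of positive sera across all samples, ignoring population sizes."""
    positives = sum(s["positive"] for s in states)
    tested = sum(s["tested"] for s in states)
    return positives / tested


def population_weighted_prevalence(states):
    """National estimate that weights each state's sample proportion by its population."""
    total_population = sum(s["population"] for s in states)
    weighted = sum(s["population"] * s["positive"] / s["tested"] for s in states)
    return weighted / total_population


if __name__ == "__main__":
    # Hypothetical data: two states with equal sample sizes but unequal populations.
    example = [
        {"state": "A", "population": 100, "tested": 100, "positive": 10},
        {"state": "B", "population": 300, "tested": 100, "positive": 30},
    ]
    print(pooled_sample_prevalence(example))
    print(population_weighted_prevalence(example))
```

When sampling effort is not proportional to population size, the two figures diverge, which is why sample-based reporting alone may misstate the national rate.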
8.
PLoS One ; 17(11): e0277431, 2022.
Article in English | MEDLINE | ID: mdl-36449484

ABSTRACT

Early detection of lung cancer is a crucial factor for increasing its survival rates among the detected patients. The presence of carbonyl volatile organic compounds (VOCs) in exhaled breath can play a vital role in early detection of lung cancer. Identifying these VOC markers in breath samples through innovative statistical and machine learning techniques is an important task in lung cancer research. Therefore, we proposed an experimental approach for generation of VOC molecular concentration data using unique silicon microreactor technology and further identification and characterization of key relevant VOCs important for lung cancer detection through statistical and machine learning algorithms. We reported several informative VOCs and tested their effectiveness in multi-group classification of patients. Our analytical results indicated that seven key VOCs, including C4H8O2, C13H22O, C11H22O, C2H4O2, C7H14O, C6H12O, and C5H8O, are sufficient to detect the lung cancer patients with higher mean classification accuracy (92%) and lower standard error (0.03) compared to other combinations. In other words, the molecular concentrations of these VOCs in exhaled breath samples were able to discriminate the patients with lung cancer (n = 156) from the healthy smoker and nonsmoker controls (n = 193) and patients with benign pulmonary nodules (n = 65). The quantification of carbonyl VOC profiles from breath samples and identification of crucial VOCs through our experimental approach paves the way forward for non-invasive lung cancer detection. Further, our experimental and analytical approach of VOC quantitative analysis in breath samples may be extended to other diseases, including COVID-19 detection.


Subject(s)
Body Fluids , COVID-19 , Lung Neoplasms , Multiple Pulmonary Nodules , Volatile Organic Compounds , Humans , Lung Neoplasms/diagnosis
9.
MethodsX ; 8: 101580, 2021.
Article in English | MEDLINE | ID: mdl-35004214

ABSTRACT

Single-cell RNA-sequencing (scRNA-seq) is a recent high-throughput genomic technology used to study the expression dynamics of genes at the single-cell level. Analyzing scRNA-seq data in the presence of biological confounding factors, including dropout events, is a challenging task. Thus, this article presents a novel statistical approach for various analyses of scRNA-seq Unique Molecular Identifier (UMI) count data. These analyses include modeling and fitting of observed UMI data, cell type detection, estimation of cell capture rates, estimation of gene-specific model parameters, estimation of the sample mean and sample variance of the genes, etc. In addition, the developed approach is able to perform differential expression and other downstream analyses that consider the molecular capture process in scRNA-seq data modeling. External spike-in data can also be used in the approach for better results. The unique feature of the method is that it considers the biological process that leads to severe dropout events in modeling the observed UMI counts of genes.
• The differential expression analysis of observed scRNA-seq UMI count data is performed after adjustment for cell capture rates.
• The statistical approach performs downstream differential zero inflation analysis, classification of influential genes, and selection of top marker genes.
• Cell auxiliaries including cell clusters and other cell variables (e.g., cell cycle, cell phase) are used to remove unwanted variation to perform statistical tests reliably.

10.
Genes (Basel) ; 12(12)2021 Dec 02.
Article in English | MEDLINE | ID: mdl-34946896

ABSTRACT

Single-cell RNA-sequencing (scRNA-seq) is a recent high-throughput sequencing technique for studying gene expression at the cell level. Differential Expression (DE) analysis is a major downstream analysis of scRNA-seq data. DE analysis in the presence of noise from different sources remains a key challenge in scRNA-seq. Earlier practices for addressing this involved borrowing methods from bulk RNA-seq, which are based on non-zero differences in average expression of genes across cell populations. Later, several methods specifically designed for scRNA-seq were developed. To provide guidance on choosing an appropriate tool or developing a new one, it is necessary to comprehensively study the performance of DE analysis methods. Here, we provide a review and classification of different DE approaches adapted from bulk RNA-seq practice as well as those specifically designed for scRNA-seq. We also evaluate the performance of 19 widely used methods in terms of 13 performance metrics on 11 real scRNA-seq datasets. Our findings suggest that some bulk RNA-seq methods are quite competitive with the single-cell methods and that their performance depends on the underlying models, DE test statistic(s), and data characteristics. Further, it is difficult to identify a method that performs best globally on the basis of an individual performance criterion. However, the multi-criteria and combined-data analysis indicates that DECENT and EBSeq are the best options for DE analysis. The results also reveal the similarities among the tested methods in terms of detecting common DE genes. Our evaluation provides guidelines for selecting the tool that performs best under particular experimental settings in the context of scRNA-seq.


Subject(s)
Gene Expression Profiling/methods , RNA-Seq/methods , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Software/statistics & numerical data , Algorithms , Animals , Databases, Nucleic Acid , Humans , Mice , Sequence Analysis, RNA/statistics & numerical data , Single-Cell Analysis/statistics & numerical data
11.
J Family Med Prim Care ; 9(7): 3619-3622, 2020 Jul.
Article in English | MEDLINE | ID: mdl-33102339

ABSTRACT

BACKGROUND: Intravenous (IV) iron sucrose is claimed to have a better safety profile and efficacy in the treatment of iron deficiency anemia than conventional oral iron supplements. AIM: The aim of the study was to compare the efficacy and safety of IV iron therapy with oral iron supplements in iron deficiency anemia. METHODS: An observational study was carried out by allocating 100 patients with baseline hemoglobin between 5 and 10 g/dL into two groups, an oral iron group and an IV iron group. Hemoglobin and serum ferritin levels were measured at admission, on day 14 and on day 28. The adverse effect profile for each group was tabulated. Mean and standard deviation were calculated for each group and compared. RESULTS: A total of 100 patients participated, consisting of 37 males and 63 females. Baseline hemoglobin and serum ferritin for both groups were comparable. After initiation of therapy, hemoglobin in the oral iron group rose from 6.45 (0.72) to 8.84 (0.47) on day 14 and to 9.69 (0.47) on day 28. Hemoglobin in the IV iron group increased from 6.34 (0.86) to 10.52 (0.61) on day 14 and to 11.66 (0.84) on day 28. Serum ferritin in the oral iron group increased from 8.3 (1.9) to 33.8 (1.29) on day 14 and to 43.61 (8.8) on day 28. Serum ferritin in the IV iron group rose from 8.23 (4.64) to 148.23 (11.86) on day 14 but decreased to 115.76 (15.3) on day 28. The data were statistically significant for IV iron therapy on day 14 and day 28. Of 100 patients, 18 (12 in the oral and 6 in the IV iron group) had adverse effects. In the oral iron group, metallic taste and constipation were the major side effects, followed by heartburn and nausea. In the IV iron group, arthralgia (4 of 6 patients) was the major side effect observed. One patient (of 6) in the IV group had hypotension. Anaphylaxis was not observed in any patient in either group. CONCLUSION: IV iron therapy is effective and safe for management of iron deficiency anemia.
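
The mean hemoglobin values reported in this abstract can be turned into gains over baseline with simple arithmetic. The sketch below uses only the group means quoted above; the units are presumably g/dL, as for the baseline inclusion range, and the dictionary layout and function name are hypothetical. It does not reproduce any significance test, since group sizes are not stated in the abstract.

```python
# Mean hemoglobin per group and time point, copied from the abstract.
REPORTED_HEMOGLOBIN = {
    "oral": {"baseline": 6.45, "day14": 8.84, "day28": 9.69},
    "iv": {"baseline": 6.34, "day14": 10.52, "day28": 11.66},
}


def mean_gain(group, day):
    """Difference between the reported mean at `day` and the baseline mean."""
    values = REPORTED_HEMOGLOBIN[group]
    return round(values[day] - values["baseline"], 2)


if __name__ == "__main__":
    for group in ("oral", "iv"):
        for day in ("day14", "day28"):
            print(group, day, mean_gain(group, day))
```

On these figures the mean gain by day 28 is 3.24 in the oral group and 5.32 in the IV group.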

12.
Gene ; 655: 71-83, 2018 May 20.
Article in English | MEDLINE | ID: mdl-29458166

ABSTRACT

Selection of informative genes from high dimensional gene expression data has emerged as an important research area in genomics. Many of the gene selection techniques proposed so far are based on either a relevancy or a redundancy measure. Further, the performance of these techniques has been judged through post-selection classification accuracy computed by a classifier using the selected genes. This performance metric may be statistically sound but may not be biologically relevant. A statistical approach, i.e. Boot-MRMR, was proposed based on a composite measure of maximum relevance and minimum redundancy, which is both statistically sound and biologically relevant for informative gene selection. For comparative evaluation of the proposed approach, we developed two biologically sufficient criteria, i.e. Gene Set Enrichment with QTL (GSEQ) and a biological similarity score based on Gene Ontology (GO). Further, a systematic and rigorous evaluation of the proposed technique against 12 existing gene selection techniques was carried out using five gene expression datasets. This evaluation was based on a broad spectrum of statistically sound (e.g. subject classification) and biologically relevant (based on QTL and GO) criteria under a multiple criteria decision-making framework. The performance analysis showed that the proposed technique selects informative genes which are more biologically relevant. The proposed technique was also found to be quite competitive with the existing techniques with respect to subject classification and computational time. Our results also showed that under the multiple criteria decision-making setup, the proposed technique is the best of the available alternatives for informative gene selection. Based on the proposed approach, an R package, i.e. BootMRMR, has been developed and is available at https://cran.r-project.org/web/packages/BootMRMR. This study will provide a practical guide to selecting statistical techniques for choosing informative genes from high dimensional expression data for breeding and systems biology studies.


Subject(s)
Algorithms , Data Interpretation, Statistical , Gene Expression Profiling/statistics & numerical data , Genes , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Computational Biology/methods , Gene Expression Profiling/methods , Gene Ontology , Genes/physiology , Genome, Human , Genomics/methods , Genomics/statistics & numerical data , Humans , Oligonucleotide Array Sequence Analysis/methods , Sample Size
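
The composite of maximum relevance and minimum redundancy named in this abstract can be sketched as a greedy selection loop. This is a plain, correlation-based MRMR illustration and not the Boot-MRMR procedure: it omits the bootstrap and any significance testing, and the use of absolute Pearson correlation for both relevance and redundancy, as well as all names, are assumptions made for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def mrmr_select(expression, labels, n_select):
    """Greedily pick genes with high relevance to the labels and low redundancy
    with the genes already picked.

    expression: dict mapping gene name -> list of expression values per subject
    labels: list of numeric class labels, one per subject
    """
    relevance = {g: abs(pearson(v, labels)) for g, v in expression.items()}
    selected = []
    candidates = set(expression)

    def score(gene):
        if not selected:
            return relevance[gene]
        redundancy = sum(
            abs(pearson(expression[gene], expression[s])) for s in selected
        ) / len(selected)
        return relevance[gene] - redundancy  # difference form of the MRMR criterion

    while candidates and len(selected) < n_select:
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    # Toy data: g2 is a scaled copy of g1, so at most one of them should be kept.
    toy = {
        "g1": [1, 2, 3, 4, 5, 6],
        "g2": [2, 4, 6, 8, 10, 12],
        "g3": [3, 1, 2, 6, 4, 5],
    }
    print(mrmr_select(toy, [0, 0, 0, 1, 1, 1], 2))
```

In the toy data all three genes are about equally relevant, so the redundancy term is what keeps the duplicated pair from being selected together.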
13.
Sci Rep ; 8(1): 2391, 2018 Feb 05.
Article in English | MEDLINE | ID: mdl-29402907

ABSTRACT

The analysis of gene sets is usually carried out based on gene ontology terms and known biological pathways. These approaches may not establish any formal relation between genotype and trait-specific phenotype. In plant biology and breeding, analysis of gene sets with trait-specific Quantitative Trait Loci (QTL) data is considered a great source for biological knowledge discovery. Therefore, we proposed an innovative statistical approach called Gene Set Analysis with QTLs (GSAQ) for interpreting gene expression data in the context of gene sets with traits. The utility of GSAQ was studied on five different complex abiotic and biotic stress scenarios in rice, which yielded specific trait/stress enriched gene sets. Further, the GSAQ approach was more innovative and effective than the existing approach in performing gene set analysis with underlying QTLs and identifying QTL candidate genes. The GSAQ approach also provided two potential biologically relevant criteria for performance analysis of gene selection methods. Based on this proposed approach, an R package, i.e., GSAQ ( https://cran.r-project.org/web/packages/GSAQ ), has been developed. The GSAQ approach provides a valuable platform for integrating gene expression data with genetically rich QTL data.


Subject(s)
Computational Biology/methods , Genotype , Models, Statistical , Oryza/genetics , Phenotype , Plant Breeding/methods , Quantitative Trait Loci , Genes, Plant , Oryza/physiology , Stress, Physiological
14.
PLoS One ; 12(1): e0169605, 2017.
Article in English | MEDLINE | ID: mdl-28056073

ABSTRACT

Selection of informative genes is an important problem in gene expression studies. The small sample size and the large number of genes in gene expression data make the selection process complex. Further, the selected informative genes may act as a vital input for gene co-expression network analysis. Moreover, the identification of hub genes and module interactions in gene co-expression networks is yet to be fully explored. This paper presents a statistically sound gene selection technique based on the support vector machine algorithm for selecting informative genes from high dimensional gene expression data. An attempt has also been made to develop a statistical approach for identifying hub genes in the gene co-expression network. In addition, a differential hub gene analysis approach has been developed to group the identified hub genes into various groups based on their gene connectivity in a case vs. control study. Based on this proposed approach, an R package, i.e., dhga (https://cran.r-project.org/web/packages/dhga), has been developed. The comparative performance of the proposed gene selection technique and of the hub gene identification approach was evaluated on three different crop microarray datasets. The proposed gene selection technique outperformed most of the existing techniques in selecting a robust set of informative genes. The proposed hub gene identification approach identified fewer hub genes than the existing approach, which is in accordance with the scale-free property of real networks. In this study, some key genes along with their Arabidopsis orthologs have been reported, which can be used for engineering the Aluminum toxic stress response in soybean. The functional analysis of various selected key genes revealed the underlying molecular mechanisms of the Aluminum toxic stress response in soybean.


Subject(s)
Aluminum/toxicity , Glycine max/drug effects , Glycine max/genetics , Algorithms , Arabidopsis/drug effects , Arabidopsis/genetics , Gene Regulatory Networks/drug effects , Gene Regulatory Networks/genetics