Results 1 - 20 of 29
1.
Stat Probab Lett ; 167, 2020 Dec.
Article in English | MEDLINE | ID: mdl-33304024

ABSTRACT

When a statistical test is repeatedly applied to rows of a data matrix, correlations among data rows will give rise to correlations among corresponding test statistics. We investigate the relationship between test-statistic correlation and data-row correlation and discuss its implications.
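
The question raised in this abstract can be explored with a small simulation. The sketch below is only an illustration under stated assumptions, not the authors' analysis: it assumes a one-sample t-test applied to two data rows whose paired entries are bivariate normal with correlation `rho`, and the function names and default parameters are hypothetical.

```python
import math
import random
import statistics


def one_sample_t(xs):
    """One-sample t-statistic for H0: mean == 0."""
    n = len(xs)
    return statistics.fmean(xs) / (statistics.stdev(xs) / math.sqrt(n))


def pearson(a, b):
    """Plain Pearson correlation of two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)


def simulate_t_correlation(rho, n_cols=10, n_reps=4000, seed=0):
    """Estimate corr(t1, t2) when two data rows have correlation rho.

    Assumption: each column holds an independent bivariate normal pair.
    """
    rng = random.Random(seed)
    t1, t2 = [], []
    for _ in range(n_reps):
        row1, row2 = [], []
        for _ in range(n_cols):
            z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
            row1.append(z1)
            # Mix z1 into row2 so that corr(row1[j], row2[j]) equals rho.
            row2.append(rho * z1 + math.sqrt(1 - rho ** 2) * z2)
        t1.append(one_sample_t(row1))
        t2.append(one_sample_t(row2))
    return pearson(t1, t2)
```

Calling `simulate_t_correlation(0.0)` and `simulate_t_correlation(0.8)` gives a rough feel for how correlation between rows carries over to the test statistics in this particular setting; other tests or distributions may behave differently.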

2.
PLoS Genet ; 9(6): e1003496, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23818858

ABSTRACT

The ascomycete fungus Tolypocladium inflatum, a pathogen of beetle larvae, is best known as the producer of the immunosuppressant drug cyclosporin. The draft genome of T. inflatum strain NRRL 8044 (ATCC 34921), the isolate from which cyclosporin was first isolated, is presented along with comparative analyses of the biosynthesis of cyclosporin and other secondary metabolites in T. inflatum and related taxa. Phylogenomic analyses reveal previously undetected and complex patterns of homology between the nonribosomal peptide synthetase (NRPS) that encodes for cyclosporin synthetase (simA) and those of other secondary metabolites with activities against insects (e.g., beauvericin, destruxins, etc.), and demonstrate the roles of module duplication and gene fusion in diversification of NRPSs. The secondary metabolite gene cluster responsible for cyclosporin biosynthesis is described. In addition to genes necessary for cyclosporin biosynthesis, it harbors a gene for a cyclophilin, which is a member of a family of immunophilins known to bind cyclosporin. Comparative analyses support a lineage specific origin of the cyclosporin gene cluster rather than horizontal gene transfer from bacteria or other fungi. RNA-Seq transcriptome analyses in a cyclosporin-inducing medium delineate the boundaries of the cyclosporin cluster and reveal high levels of expression of the gene cluster cyclophilin. In medium containing insect hemolymph, weaker but significant upregulation of several genes within the cyclosporin cluster, including the highly expressed cyclophilin gene, was observed. T. inflatum also represents the first reference draft genome of Ophiocordycipitaceae, a third family of insect pathogenic fungi within the fungal order Hypocreales, and supports parallel and qualitatively distinct radiations of insect pathogens. The T. inflatum genome provides additional insight into the evolution and biosynthesis of cyclosporin and lays a foundation for further investigations of the role of secondary metabolite gene clusters and their metabolites in fungal biology.


Subject(s)
Beetles/microbiology , Cyclosporine/metabolism , Hypocreales/genetics , Multienzyme Complexes/genetics , Peptide Synthases/genetics , Animals , Molecular Evolution , Horizontal Gene Transfer , Genome , Hypocreales/enzymology , Multienzyme Complexes/metabolism , Multigene Family , Peptide Synthases/metabolism , Phylogeny , RNA Sequence Analysis
3.
Environ Monit Assess ; 188(1): 42, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26687085

ABSTRACT

Cluster analysis (CA), discriminant analysis (DA), and principal component analysis/factor analysis (PCA/FA) were used to analyze the interannual, seasonal, and spatial variations of water quality from 1991 to 2011 at the control points (Xinzhuang Bridge, Daguan Bridge) of the main rivers (Chaohe River, Baihe River) flowing into the Miyun Reservoir. The results demonstrated that total nitrogen (TN) and total phosphorus (TP) exceeded China National Standard II for surface water by 5.08 times and 1 time, respectively. CA showed that the water quality could be divided into three interannual (IA) groups, IAI (1991-1995, 1998), IAII (1996-1997, 1999-2000, 2002-2006), and IAIII (2001, 2007-2011), and into seasonal clusters comprising two dry-season groups, dry season 1 (December) and dry season 2 (January-February), and a non-dry season (March-November). At the interannual scale, the higher concentration of SO₄²⁻ from industrial activities, atmospheric sedimentation, and fertilizer use in IAIII accelerated dissolution of carbonate, which increased Ca²⁺, Mg²⁺, total hardness (T-Hard), and total alkalinity (T-Alk). The decreasing trend of CODMn was attributed to the establishment of sewage treatment plants and to water and soil conservation upstream of Miyun. The changing trend of NO₃⁻-N indicated an increasing non-point pollution load in IAII and effective non-point pollution control in IAIII. Only one parameter, T, verified improved non-point pollution control at the seasonal scale. The major pollution at the two control points was NO₃⁻-N, T-Hard, TN, and other ion pollution (SO₄²⁻, F⁻, Ca²⁺, Mg²⁺, T-Hard, T-Alk). Higher concentrations of NO₃⁻-N at Xinzhuang and of CODMn at Daguan indicated that different control measures are needed, especially controlling agricultural intensification in the Chaohe River to decrease N pollution, and reducing water and soil loss and cage culture in the Baihe River to weaken organic pollution. Controlling SO₄²⁻ from industrial activity, atmospheric sedimentation, and fertilizer use in the watershed can effectively control Ca²⁺, Mg²⁺, T-Hard, and T-Alk.


Subject(s)
Environmental Monitoring , Rivers/chemistry , Water Pollution/statistics & numerical data , Agriculture , China , Cluster Analysis , Fertilizers/analysis , Nitrogen/analysis , Phosphorus/analysis , Principal Component Analysis , Seasons , Water Pollutants/analysis , Water Quality
4.
J Exp Bot ; 65(20): 5889-902, 2014 Nov.
Article in English | MEDLINE | ID: mdl-25135520

ABSTRACT

Transcriptional studies in relation to fruit ripening generally aim to identify the transcriptional states associated with physiological ripening stages and the transcriptional changes between stages within the ripening programme. In non-climacteric fruits such as grape, all ripening-related genes involved in this programme have not been identified, mainly due to the lack of mutants for comparative transcriptomic studies. A feature in grape cluster ripening (Vitis vinifera cv. Pinot noir), where all berries do not initiate the ripening at the same time, was exploited to study their shifted ripening programmes in parallel. Berries that showed marked ripening state differences in a véraison-stage cluster (ripening onset) ultimately reached similar ripeness states toward maturity, indicating the flexibility of the ripening programme. The expression variance between these véraison-stage berry classes, where 11% of the genes were found to be differentially expressed, was reduced significantly toward maturity, resulting in the synchronization of their transcriptional states. Defined quantitative expression changes (transcriptional distances) not only existed between the véraison transitional stages, but also between the véraison to maturity stages, regardless of the berry class. It was observed that lagging berries complete their transcriptional programme in a shorter time through altered gene expressions and ripening-related hormone dynamics, and enhance the rate of physiological ripening progression. Finally, the reduction in expression variance of genes can identify new genes directly associated with ripening and also assess the relevance of gene activity to the phase of the ripening programme.


Subject(s)
Fruit/genetics , Plant Gene Expression Regulation , Genetic Transcription , Vitis/genetics , Fruit/growth & development , Fruit/physiology , Gene Expression Profiling , Oligonucleotide Array Sequence Analysis , Plant Growth Regulators/metabolism , Time Factors , Vitis/growth & development , Vitis/physiology
5.
Stat Appl Genet Mol Biol ; 12(1): 49-70, 2013 Mar 26.
Article in English | MEDLINE | ID: mdl-23502340

ABSTRACT

RNA sequencing (RNA-Seq) is the current method of choice for characterizing transcriptomes and quantifying gene expression changes. This next generation sequencing-based method provides unprecedented depth and resolution. The negative binomial (NB) probability distribution has been shown to be a useful model for frequencies of mapped RNA-Seq reads and consequently provides a basis for statistical analysis of gene expression. Negative binomial exact tests are available for two-group comparisons but do not extend to negative binomial regression analysis, which is important for examining gene expression as a function of explanatory variables and for adjusted group comparisons accounting for other factors. We address the adequacy of available large-sample tests for the small sample sizes typically available from RNA-Seq studies and consider a higher-order asymptotic (HOA) adjustment to likelihood ratio tests. We demonstrate that 1) the HOA-adjusted likelihood ratio test is practically indistinguishable from the exact test in situations where the exact test is available, 2) the type I error of the HOA test matches the nominal specification in regression settings we examined via simulation, and 3) the power of the likelihood ratio test does not appear to be affected by the HOA adjustment. This work helps clarify the accuracy of the unadjusted likelihood ratio test and the degree of improvement available with the HOA adjustment. Furthermore, the HOA test may be preferable even when the exact test is available because it does not require ad hoc library size adjustments.


Subject(s)
Gene Expression Profiling/methods , Genetic Models , RNA Sequence Analysis , Algorithms , Arabidopsis/genetics , Base Sequence , Computer Simulation , High-Throughput Nucleotide Sequencing , Likelihood Functions , Statistical Models , Poisson Distribution , Pseudomonas syringae/genetics , Bacterial RNA/genetics , Plant RNA/genetics , Regression Analysis
6.
BMC Plant Biol ; 13: 92, 2013 Jun 25.
Article in English | MEDLINE | ID: mdl-23799904

ABSTRACT

BACKGROUND: Cytosine DNA methylation (5mC) is an epigenetic modification that is important to genome stability and regulation of gene expression. Perturbations of 5mC have been implicated as a cause of phenotypic variation among plants regenerated through in vitro culture systems. However, the pattern of change in 5mC and its functional role with respect to gene expression, are poorly understood at the genome scale. A fuller understanding of how 5mC changes during in vitro manipulation may aid the development of methods for reducing or amplifying the mutagenic and epigenetic effects of in vitro culture and plant transformation. RESULTS: We investigated the in vitro methylome of the model tree species Populus trichocarpa in a system that mimics routine methods for regeneration and plant transformation in the genus Populus (poplar). Using methylated DNA immunoprecipitation followed by high-throughput sequencing (MeDIP-seq), we compared the methylomes of internode stem segments from micropropagated explants, dedifferentiated calli, and internodes from regenerated plants. We found that more than half (56%) of the methylated portion of the genome appeared to be differentially methylated among the three tissue types. Surprisingly, gene promoter methylation varied little among tissues, however, the percentage of body-methylated genes increased from 9% to 14% between explants and callus tissue, then decreased to 8% in regenerated internodes. Forty-five percent of differentially-methylated genes underwent transient methylation, becoming methylated in calli, and demethylated in regenerants. These genes were more frequent in chromosomal regions with higher gene density. Comparisons with an expression microarray dataset showed that genes methylated at both promoters and gene bodies had lower expression than genes that were unmethylated or only promoter-methylated in all three tissues. 
Four types of abundant transposable elements showed their highest levels of 5mC in regenerated internodes. CONCLUSIONS: DNA methylation varies in a highly gene- and chromosome-differential manner during in vitro differentiation and regeneration. 5mC in redifferentiated tissues was not reset to that in original explants during the study period. Hypermethylation of gene bodies in dedifferentiated cells did not interfere with transcription, and may serve a protective role against activation of abundant transposable elements.


Subject(s)
Cell Dedifferentiation , Populus/cytology , Populus/genetics , Cell Culture Techniques , Cultured Cells , Cytosine/metabolism , DNA Methylation , Epigenomics , Populus/physiology , Genetic Transformation
7.
Heliyon ; 9(1): e12862, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36691531

ABSTRACT

The assessment of different aroma families on tropical fruit aroma perception is still not well understood. This study aimed to investigate the effect of esters and volatile thiols on tropical fruit aroma perception in white wines. Four levels of thiols (none, low, medium and high) and three levels of esters (none, low, medium) were added to a dearomatized white wine base in a full factorial design. Check-All-That-Apply (CATA) was used to determine the aroma descriptors that most differentiated the wines followed by Sensory Descriptive Analysis (SDA) to evaluate the intensity of those significant aroma attributes. More than 78% of the total variance was described in the first two dimensions when using Canonical Variate Analysis. Tropical fruit aromas were associated with wines containing different levels of esters and ester-thiol combinations. Volatile thiols alone imparted an earthy aroma and were grouped with the control wine. The different ester-thiol combinations altered the tropical fruit aroma quality in the wines from citrus to passionfruit, pineapple and guava. Understanding the cause of tropical fruit aroma allows for targeted processing to achieve the desired wine sensory quality.

8.
Genet Epidemiol ; 35 Suppl 1: S115-9, 2011.
Article in English | MEDLINE | ID: mdl-22128051

ABSTRACT

We summarize the contributions of Group 9 of Genetic Analysis Workshop 17. This group addressed the problems of linkage disequilibrium and other longer range forms of allelic association when evaluating the effects of genotypes on phenotypes. Issues raised by long-range associations, whether a result of selection, stratification, possible technical errors, or chance, were less expected but proved to be important. Most contributors focused on regression methods of various types to illustrate problematic issues or to develop adaptations for dealing with high-density genotype assays. Study design was also considered, as was graphical modeling. Although no method emerged as uniformly successful, most succeeded in reducing false-positive results either by considering clusters of loci within genes or by applying smoothing metrics that required results from adjacent loci to be similar. Two unexpected results that questioned our assumptions of what is required to model linkage disequilibrium were observed. The first was that correlations between loci separated by large genetic distances can greatly inflate single-locus test statistics, and, whether the result of selection, stratification, possible technical errors, or chance, these correlations seem overabundant. The second unexpected result was that applying principal components analysis to genome-wide genotype data can apparently control not only for population structure but also for linkage disequilibrium.


Subject(s)
Linkage Disequilibrium , Statistical Models , Molecular Epidemiology/methods , Computer Graphics , Statistical Data Interpretation , Genomic Structural Variation , Human Genome Project , Humans , Genetic Models , Single Nucleotide Polymorphism , Principal Component Analysis , Regression Analysis
9.
Matrix Biol Plus ; 15: 100117, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35898192

ABSTRACT

Increasingly, the matrisome, a set of proteins that form the core of the extracellular matrix (ECM) or are closely associated with it, has been demonstrated to play a key role in tumor progression. However, in the context of gynecological cancers, the matrisome has not been well characterized. A holistic, yet targeted, exploration of the tumor microenvironment is critical for better understanding the progression of gynecological cancers, identifying key biomarkers for cancer progression, establishing the role of gene expression in patient survival, and for assisting in the development of new targeted therapies. In this work, we explored the matrisome gene expression profiles of cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), uterine corpus endometrial carcinoma (UCEC), and uterine carcinosarcoma (UCS) using publicly available RNA-seq data from The Cancer Genome Atlas (TCGA) and The Genotype-Tissue Expression (GTEx) portal. We hypothesized that the matrisomal expression patterns of CESC, UCEC, and UCS would be highly distinct with respect to genes which are differentially expressed and hold inferential significance with respect to tumor progression, patient survival, or both. Through a combination of statistical and machine learning analysis techniques, we identified sets of genes and gene networks which characterized each of the gynecological cancer cohorts. Our findings demonstrate that the matrisome is critical for characterizing gynecological cancers and transcriptomic mechanisms of cancer progression and outcome. Furthermore, while the goal of pan-cancer transcriptional analyses is often to highlight the shared attributes of these cancer types, we demonstrate that they are highly distinct diseases which require separate analysis, modeling, and treatment approaches. 
In future studies, matrisome genes and gene ontology terms that were identified as holding inferential significance for cancer stage and patient survival can be evaluated as potential drug targets and incorporated into in vitro models of disease.

10.
Huan Jing Ke Xue ; 43(1): 256-266, 2022 Jan 08.
Article in Chinese | MEDLINE | ID: mdl-34989510

ABSTRACT

Due to the limitations of the treatment process of urban sewage treatment plants and the complexity of water sources, the abundant inorganic nitrogen and trace persistent organic matter in reclaimed water pose potential human health risks through lateral leakage or bioaccumulation when rivers and lakes are replenished. Exploring the distribution patterns of the different types of characteristic water quality factors of reclaimed water, and how those patterns form in river channels replenished with reclaimed water, is of great significance to river and lake management. This study takes the Beijing-Tianjin-Hebei section of the North Canal as the research area and explores the spatial variation characteristics of conventional physical parameters, the full index, inorganic nitrogen, salt ions, and antibiotics in river water quality with the help of clustering, discrimination, principal components, and variance decomposition. The results showed that the spatial distribution patterns of the different types of water quality factors were consistent, all showing a significant division between mid-upstream and downstream; however, there were large differences in the degree and the mechanism of variation. The spatial variation of inorganic nitrogen and antibiotics was the most obvious, whereas the variation in conventional physical parameters and the full index was the weakest, and the salt ions showed moderate variation. The spatial variation mechanism of conventional physical parameters was mainly reflected in microbial degradation. The full index was the result of the combined effect of microorganisms, diffusion, the synergy of the two, and a certain degree of source-sink homogeneity. Diffusion was the main mechanism affecting the spatial variation in salt ions. The spatial variation mechanism of inorganic nitrogen was mainly reflected in source-sink homogenization and microbial degradation; as a secondary mechanism of the spatial variation of inorganic nitrogen, diffusion acted synergistically with microbial degradation. Antibiotics, which differ greatly in chemical structural stability and biodegradability, showed high spatial variability and the strongest synergy between diffusion and microbial mechanisms. This research provides a quantitative analysis of the spatial variability mechanism of water quality based on variance decomposition, which has practical guiding significance for understanding the causes of the spatial variability of river pollutants and for river management.


Subject(s)
Chemical Water Pollutants , Water Quality , China , Environmental Monitoring , Humans , Lakes , Nitrogen/analysis , Rivers , Chemical Water Pollutants/analysis
11.
Huan Jing Ke Xue ; 43(2): 803-812, 2022 Feb 08.
Article in Chinese | MEDLINE | ID: mdl-35075854

ABSTRACT

Reclaimed water plays an important role in alleviating the shortage of urban water resources; however, the trace pollutants and pathogens in reclaimed water have an effect on the plankton community in the receiving water. This study investigated the spatial variation mechanism of microbial community diversity in the Beijing-Tianjin-Hebei reach of the North Canal River based on OTU-level and phylum-level data matrices of fragment number and fragment abundance. The results showed that the physical and chemical disturbance caused by the frequent inflow of reclaimed water changed the hydrology and water quality of the water body, and the plankton community could be divided into two groups along the geographical scale: the middle and upstream cluster (MUC) and the downstream cluster (DC). The analysis of diversity indices based on the OTU data matrix showed that the species diversity of the DC group was significantly higher than that of the MUC group, whereas the abundance distribution and evenness showed the opposite trend. Species richness was mainly determined by the fragment diversity of the occasional microflora; evenness was mainly determined by variation in the abundance of the dominant microflora; and the sensitivity of subcommunities with different abundance levels to spatial change was in the order non-dominant microflora > occasional microflora > dominant microflora. The diversity analysis based on the phylum-level data matrix also showed that the species diversity of the DC group was significantly higher than that of the MUC group, and the trend in abundance was the opposite; the most sensitive group was the non-dominant phyla, followed by the occasional phyla, and the dominant phyla were the least sensitive. The data matrix based on phylum-level fragment number was more sensitive to environmental changes than the data matrix based on phylum-level abundance. The environmental factors significantly related to the microbial community were turbidity; permanganate index; oxidation-reduction potential (ORP); macrolides (MLs); tetracycline antibiotics (TCs); and regional response factors of salt ions, carbon, and inorganic nitrogen. In terms of abundance and diversity, the phyla that were significantly more abundant in the DC group than in the MUC group were more significantly negatively correlated with MLs and positively correlated with TCs, whereas the phyla that were significantly more abundant in the MUC group than in the DC group were more significantly positively correlated with MLs. The research results can provide a theoretical basis and technical guidance for the ecological rehabilitation of urban river courses with reclaimed water as their main water supply source.


Subject(s)
Microbiota , Plankton , Beijing , Microbiota/genetics , Plankton/genetics , Rivers , Water Quality
12.
Huan Jing Ke Xue ; 42(11): 5424-5432, 2021 Nov 08.
Article in Chinese | MEDLINE | ID: mdl-34708981

ABSTRACT

As the bridge of pollutant exchange between sediments and aquatic ecosystems, microorganisms play an important role in material circulation. However, there are few comparative studies of microorganisms in the water and sediment of urban rivers with unconventional water supply, sluice dams, and lining closure. The highly artificial Beijing-Tianjin-Hebei section of the North Canal was chosen for this study. We analyzed the differences in microbial community composition between water and sediment using high-throughput sequencing. The results show that the microbial communities in the sediments of the North Canal have higher α-diversity than those in the water. With regard to β-diversity, the similarity of microbial communities in the water is higher than that in the sediment. There is no significant difference in the abundance of Proteobacteria between water and sediments. The abundance of α-Proteobacteria, Actinobacteria, Cyanobacteria, and Verrucomicrobia was higher in water than in sediment, while the abundance of γ-Proteobacteria, δ-Proteobacteria, Chloroflexi, Firmicutes, and Acidobacteria was higher in sediment than in water. Aerobes or facultative anaerobes dominated the microbial community in the water, while anaerobes dominated the sediments. The risk of bacteria releasing pathogens from the sediment into the water habitat is high. The research results provide a scientific basis for revealing the mechanism of microbial community change under river pollution risk in highly artificial rivers supplied with reclaimed water.


Subject(s)
Cyanobacteria , Microbiota , Geologic Sediments , High-Throughput Nucleotide Sequencing , Microbiota/genetics , 16S Ribosomal RNA/genetics , Water
13.
Huan Jing Ke Xue ; 42(5): 2287-2295, 2021 May 08.
Article in Chinese | MEDLINE | ID: mdl-33884798

ABSTRACT

Sediment bacteria have attracted much attention because of their important roles in energy flow and pollutant cycle transformation. The changes in the spatial distribution pattern of bacteria are the basis for research on the biodiversity generation and maintenance mechanisms. However, there are few studies on the spatial variation in benthic microorganisms and its biogeographic models. The highly artificial North Canal River across the Beijing-Tianjin-Hebei area was chosen as the research area in this study. The spatial variation in the different classification levels of the Kingdom, Phylum, Class, Order, Family, Genus, Species, and operational taxonomic units and their diversity formation mechanisms were analyzed. The results showed that the samples at different classification levels had a more homogeneous distribution pattern. There were clearer distribution boundaries at the low classification levels than at the high classification levels. The significance of the bacterial community variation increased as the classification level of the bacterial community decreased. Furthermore, the difference between groups increased and the similarities within groups decreased as the classification level of the bacterial community decreased. The typical rhizosphere microorganisms represented by Frankiales and Rhodobacterales showed significant enrichment in the upstream samples, followed by the midstream samples and a significant decrease in the downstream samples. Microorganisms related to the carbon, nitrogen, and sulfur cycles represented by Anaerolineales and Desulfobacterales showed significant enrichment in the midstream, followed by the downstream and a significant reduction in the upstream. The genus Phenylobacterium was significantly enriched in the upstream followed by the midstream, and was significantly reduced in the downstream. 
The pathogenic bacteria represented by Clostridium_gasigenes and Moraxella_osloensis showed a significant enrichment pattern in the midstream. The contents of Ca²⁺, SO₄²⁻, and total organic carbon (TOC) in the downstream samples were significantly higher than those in the upstream and midstream samples. The discharge of untreated wastewater downstream increased the salt and TOC contents in the sediment. The ecological restoration project in the sediment of the riparian zone decreased the salt and TOC contents in the upstream and midstream samples. Environmental selection was the main driving factor of the pattern of spatial variation in the bacterial communities in the sediments of the North Canal River.


Subject(s)
Aquaporins , Rivers , Bacteria/genetics , Beijing , Biodiversity , China , Environmental Monitoring , Geologic Sediments
14.
Hum Hered ; 68(2): 139-50, 2009.
Article in English | MEDLINE | ID: mdl-19439976

ABSTRACT

BACKGROUND/AIMS: With pedigree data, genetic linkage can be detected using inheritance vector tests, which explore the discrepancy between the posterior distribution of the inheritance vectors given observed trait values and the prior distribution of the inheritance vectors. In this paper, we propose conditional inheritance vector tests for linkage localization. These conditional tests can also be used to detect additional linkage signals in the presence of previously detected causal genes. METHODS: For linkage localization, we propose to perform inheritance vector tests conditioning on the inheritance vectors at two positions bounding a test region. We can detect additional linkage signals by conducting a further conditional test in a region with no previously detected genes. We use randomized p values to extend the marginal and conditional tests when the inheritance vectors cannot be completely determined from genetic marker data. RESULTS: We conduct simulation studies to compare and contrast the marginal and the conditional tests and to demonstrate that randomized p values can capture both the significance and the uncertainty in the test results. CONCLUSIONS: The simulation results demonstrate that the proposed conditional tests provide useful localization information, and with informative marker data, the uncertainty in randomized marginal and conditional test results is small.


Subject(s)
Quantitative Trait Loci , Female , Genetic Linkage , Humans , Male , Markov Chains , Pedigree
15.
Genes (Basel) ; 11(2), 2020 Feb 10.
Article in English | MEDLINE | ID: mdl-32050700

ABSTRACT

Model-based clustering with finite mixture models has become a widely used clustering method. One of the recent implementations is MCLUST. When objects to be clustered are summary statistics, such as regression coefficient estimates, they are naturally associated with estimation errors, whose covariance matrices can often be calculated exactly or approximated using asymptotic theory. This article proposes an extension to Gaussian finite mixture modeling, called MCLUST-ME, that properly accounts for the estimation errors. More specifically, we assume that the distribution of each observation consists of an underlying true component distribution and an independent measurement error distribution. Under this assumption, each unique value of estimation error covariance corresponds to its own classification boundary, which consequently results in a different grouping from MCLUST. Through simulation and application to an RNA-Seq data set, we discovered that, under certain circumstances, explicitly modeling estimation errors improves clustering performance or provides new insights into the data compared with simply ignoring the errors, and that the degree of improvement depends on factors such as the distribution of error covariance matrices.


Subject(s)
Gene Expression Profiling/methods , Statistical Models , Algorithms , Cluster Analysis , Normal Distribution , Probability , RNA-Seq/statistics & numerical data , Research Design , Uncertainty
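
The modeling assumption described in this abstract (a true component distribution plus an independent measurement error) can be sketched in one dimension. The code below is a minimal illustration, not the MCLUST-ME implementation: it only computes component membership probabilities for fixed, hypothetical mixture parameters, adding each observation's own error variance to every component variance.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def membership(x, err_var, weights, means, variances):
    """Posterior component probabilities for one observation.

    Each component variance is inflated by the observation's own error
    variance (true component distribution + independent error).
    """
    dens = [w * normal_pdf(x, m, v + err_var)
            for w, m, v in zip(weights, means, variances)]
    total = sum(dens)
    return [d / total for d in dens]


# Hypothetical two-component mixture, chosen only for illustration.
WEIGHTS = [0.5, 0.5]
MEANS = [0.0, 4.0]
VARIANCES = [0.25, 4.0]

precise = membership(1.5, 0.0, WEIGHTS, MEANS, VARIANCES)  # no estimation error
noisy = membership(1.5, 4.0, WEIGHTS, MEANS, VARIANCES)    # large estimation error
```

With these illustrative values, the same observed value 1.5 is assigned to the second component when its error variance is 0 and to the first component when its error variance is 4, which is one way to see why each error covariance value has its own classification boundary.
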
16.
PeerJ ; 6: e5199, 2018.
Article in English | MEDLINE | ID: mdl-30013849

ABSTRACT

The accumulation of RNA sequencing (RNA-Seq) gene expression data in recent years has resulted in large and complex data sets of high dimensions. Exploratory analysis, including data mining and visualization, reveals hidden patterns and potential outliers in such data, but is often challenged by the high-dimensional nature of the data. The scatterplot matrix is a commonly used tool for visualizing multivariate data, and allows us to view multiple bivariate relationships simultaneously. However, the scatterplot matrix becomes less effective for high-dimensional data because the number of bivariate displays increases quadratically with data dimensionality. In this study, we introduce a selection criterion for each bivariate scatterplot and design and implement an algorithm that automatically scans and ranks all possible scatterplots, with the goal of identifying the plots in which separation between two pre-defined groups is maximized. By applying our method to a multi-experiment Arabidopsis RNA-Seq data set, we were able to pinpoint the visualization angles where genes from two biological pathways are the most separated, as well as identify potential outliers.
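
The scan-and-rank idea can be sketched as follows. The abstract does not state the selection criterion, so the score used here (distance between group centroids divided by the mean within-group spread) is a hypothetical stand-in, and the function names are illustrative only.

```python
from itertools import combinations
import math
import statistics


def separation(points_a, points_b):
    """Hypothetical score: centroid distance over mean within-group spread."""
    def centroid(pts):
        return tuple(statistics.fmean(c) for c in zip(*pts))

    def spread(pts, c):
        return statistics.fmean(math.dist(p, c) for p in pts)

    ca, cb = centroid(points_a), centroid(points_b)
    within = (spread(points_a, ca) + spread(points_b, cb)) / 2
    return math.dist(ca, cb) / within if within > 0 else math.inf


def rank_scatterplots(rows, labels):
    """Score every pair of columns; return [(score, (i, j)), ...], best first.

    rows: equal-length numeric tuples; labels: 0 or 1 for each row.
    """
    n_dims = len(rows[0])
    ranked = []
    for i, j in combinations(range(n_dims), 2):
        a = [(r[i], r[j]) for r, g in zip(rows, labels) if g == 0]
        b = [(r[i], r[j]) for r, g in zip(rows, labels) if g == 1]
        ranked.append((separation(a, b), (i, j)))
    ranked.sort(reverse=True)
    return ranked
```

For a data set with d columns this scores d(d-1)/2 candidate plots, which reflects the quadratic growth mentioned in the abstract; only the top-ranked pairs would then need to be drawn.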

17.
Huan Jing Ke Xue ; 38(2): 743-751, 2017 Feb 08.
Article in Chinese | MEDLINE | ID: mdl-29964534

ABSTRACT

To study the effect of reclaimed water on bacterial community composition and function in urban river sediment, changes in bacterial community diversity, composition and function in the Mayu wetland following the supply of reclaimed water were investigated using terminal restriction fragment length polymorphism (T-RFLP), 16S rRNA clone library technology, and real-time quantitative PCR (qPCR). The results showed that carbon, nitrogen and phosphorus were the major factors driving variation in bacterial diversity and community structure in the river sediment, and that the bacterial community gradually recovered downstream after purification by the artificial wetland. The bacterial community at the reclaimed water outfall consisted mainly of β-Proteobacteria, δ-Proteobacteria, Bacteroidales and Cyanobacteria, while ε-Proteobacteria, Chloroflexi and Spirochaetes were unique groups. The major biogeochemical cycles in the river sediment were the nitrogen, carbon and phosphorus cycles, which were closely related to functional genes. About 45.9% of the clones at the reclaimed water outfall were related to the nitrogen cycle (e.g., Comamonas sp.), higher than upstream and downstream (27.7% and 23.4%), and 17.9% of the clones were closely related to the carbon cycle (e.g., Lysobacter sp.), higher than upstream and downstream (14.4% and 12.9%). Furthermore, traces of pathogenic bacteria and antibiotics in the reclaimed water changed the transformation patterns involved in the carbon and nitrogen cycles: for example, Rhodocyclus sp. fixed nitrogen through photosynthesis at the reclaimed water outfall, whereas Burkholderia sp. fixed nitrogen through symbiosis with plants upstream and downstream. This research provides a theoretical reference for studies on the remediation of rivers supplied with reclaimed water by artificial wetlands.


Subject(s)
Bacteria/classification , Geologic Sediments/microbiology , Water Microbiology , Wetlands , Cities , Phylogeny , RNA, Ribosomal, 16S , Rivers , Water , Water Purification
18.
PeerJ ; 4: e2791, 2016.
Article in English | MEDLINE | ID: mdl-28028467

ABSTRACT

We examined RNA-Seq data on 211 biological samples from 24 different Arabidopsis experiments carried out by different labs. We grouped the samples according to tissue types, and in each of the groups, we identified genes that are stably expressed across biological samples, treatment conditions, and experiments. We fit a Poisson log-linear mixed-effect model to the read counts for each gene and decomposed the total variance into between-sample, between-treatment and between-experiment variance components. Identifying stably expressed genes is useful for count normalization and differential expression analysis. The variance component analysis that we explore here is a first step towards understanding the sources and nature of the RNA-Seq count variation. When using a numerical measure to identify stably expressed genes, the outcome depends on multiple factors: the background sample set and the reference gene set used for count normalization, the technology used for measuring gene expression, and the specific numerical stability measure used. Since differential expression (DE) is measured by relative frequencies, we argue that DE is a relative concept. We advocate using an explicit reference gene set for count normalization to improve interpretability of DE results, and recommend using a common reference gene set when analyzing multiple RNA-Seq experiments to avoid potentially inconsistent conclusions.
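To make the variance decomposition concrete, here is a small simulation sketch. It assumes a nested layout (samples within treatments within experiments) with independent Gaussian random effects on the log scale, which is one common way to write a Poisson log-linear mixed-effect model; the abstract does not give the exact model, so the layout, function names and parameters are all assumptions.

```python
import numpy as np

def simulate_counts(n_exp, n_trt, n_rep, sd_exp, sd_trt, sd_samp, log_mean, rng):
    """Simulate read counts for one gene under an assumed nested random-effects model.

    log(mu) = log_mean + experiment effect + treatment effect + sample effect,
    and each count is Poisson(mu). Returns an array of shape (n_exp, n_trt, n_rep).
    """
    exp_eff = rng.normal(0.0, sd_exp, size=(n_exp, 1, 1))
    trt_eff = rng.normal(0.0, sd_trt, size=(n_exp, n_trt, 1))
    samp_eff = rng.normal(0.0, sd_samp, size=(n_exp, n_trt, n_rep))
    log_mu = log_mean + exp_eff + trt_eff + samp_eff  # broadcasts to full shape
    return rng.poisson(np.exp(log_mu))

def variance_shares(sd_exp, sd_trt, sd_samp):
    """Fraction of log-scale variance due to experiment, treatment and sample."""
    comps = np.array([sd_exp, sd_trt, sd_samp], dtype=float) ** 2
    return comps / comps.sum()
```

Under these assumptions, a gene whose three variance components are all small relative to other genes would be a candidate for being called stably expressed.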

19.
G3 (Bethesda) ; 6(3): 731-41, 2016 Jan 22.
Article in English | MEDLINE | ID: mdl-26801645

ABSTRACT

The ability of a fungus to infect novel hosts is dependent on changes in gene content, expression, or regulation. Examining gene expression under simulated host conditions can explore which genes may contribute to host jumping. Insect pathogenesis is the inferred ancestral character state for species of Tolypocladium; however, several species are parasites of truffles, including Tolypocladium ophioglossoides. To identify potentially crucial genes in this interkingdom host switch, T. ophioglossoides was grown on four media conditions: media containing the inner and outer portions of its natural host (truffles of Elaphomyces), cuticles from an ancestral host (beetle), and a rich medium (Yeast Malt). Through high-throughput RNA-Seq of mRNA from these conditions, many differentially expressed genes were identified in the experiment. These included PTH11-related G-protein-coupled receptors (GPCRs) hypothesized to be involved in host recognition, and also found to be upregulated in insect pathogens. A divergent chitinase with a signal peptide was also found to be highly upregulated on media containing truffle tissue, suggesting an exogenous degradative activity in the presence of the truffle host. The adhesin gene, Mad1, was highly expressed on truffle media as well. A BiNGO analysis of overrepresented GO terms from genes expressed during each growth condition found that genes involved in redox reactions and transmembrane transport were the most overrepresented during T. ophioglossoides growth on truffle media, suggesting their importance in growth on fungal tissue as compared to other hosts and environments. Genes involved in secondary metabolism were most highly expressed during growth on insect tissue, suggesting that their products may not be necessary during parasitism of Elaphomyces. This study provides clues into understanding genetic mechanisms underlying the transition from insect to truffle parasitism.


Subject(s)
Ascomycota/physiology , Gene Expression Regulation, Fungal , Genes, Fungal , Ascomycota/classification , Chitinases/genetics , Chitinases/metabolism , Cluster Analysis , Computational Biology/methods , Gene Expression Profiling , Gene Ontology , Phylogeny , Receptors, G-Protein-Coupled/genetics , Receptors, G-Protein-Coupled/metabolism , Secondary Metabolism/genetics
20.
Stat Interface ; 8(4): 405-418, 2015.
Article in English | MEDLINE | ID: mdl-28042360

ABSTRACT

We consider negative binomial (NB) regression models for RNA-Seq read counts and investigate an approach where such NB regression models are fitted to individual genes separately and, in particular, the NB dispersion parameter is estimated from each gene separately without assuming commonalities between genes. This single-gene approach contrasts with the more widely-used dispersion-modeling approach where the NB dispersion is modeled as a simple function of the mean or other measures of read abundance, and then estimated from a large number of genes combined. We show that through the use of higher-order asymptotic techniques, inferences with correct type I errors can be made about the regression coefficients in a single-gene NB regression model even when the dispersion is unknown and the sample size is small. The motivations for studying single-gene models include: 1) they provide a basis of reference for understanding and quantifying the power-robustness trade-offs of the dispersion-modeling approach; 2) they can also be potentially useful in practice if moderate sample sizes become available and diagnostic tools indicate potential problems with simple models of dispersion.
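As a concrete reference for the single-gene setting, the sketch below writes out an NB log-likelihood for one gene. It assumes the mean-dispersion parameterization often used for RNA-Seq counts, in which the variance equals mu + phi * mu^2; the abstract does not spell out the parameterization, and the function names are hypothetical. The higher-order asymptotic inference described in the abstract is not implemented here.

```python
import math

import numpy as np

# math.lgamma works on scalars; wrap it so it applies elementwise to arrays.
_lgamma = np.vectorize(math.lgamma)

def nb_logpmf(y, mu, phi):
    """Log NB probability of count y with mean mu and dispersion phi.

    Assumed parameterization: Var(y) = mu + phi * mu**2, i.e. size r = 1 / phi.
    """
    y = np.asarray(y, dtype=float)
    r = 1.0 / phi
    return (
        _lgamma(y + r) - _lgamma(r) - _lgamma(y + 1.0)
        + r * np.log(r / (r + mu))
        + y * np.log(mu / (r + mu))
    )

def nb_loglik(counts, mus, phi):
    """Log-likelihood of one gene's counts given fitted means and one dispersion."""
    return float(np.sum(nb_logpmf(counts, np.asarray(mus, dtype=float), phi)))
```

In a single-gene analysis, `phi` would be estimated from that gene's counts alone, whereas a dispersion-modeling approach would supply `phi` from a function fitted across many genes.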
