Results 1 - 20 of 117
1.
PLoS Comput Biol ; 17(6): e1009118, 2021 06.
Article in English | MEDLINE | ID: mdl-34138847

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) technologies measure gene expression at single-cell resolution and provide a tool for exploring cell heterogeneity and cell types. Because few mRNA copies are extracted per cell, scRNA-seq data exhibit a large number of dropouts, which hinders downstream analysis. We propose a statistical method, SDImpute (Single-cell RNA-seq Dropout Imputation), to implement block imputation for dropout events in scRNA-seq data. SDImpute automatically identifies the dropout events based on the gene expression levels and the variations of gene expression across similar cells and similar genes, and it implements block imputation for dropouts by utilizing gene expression unaffected by dropouts from similar cells. In the experiments, the results on simulated and real datasets suggest that SDImpute is an effective tool to recover the data and preserve the heterogeneity of gene expression across cells. Compared with state-of-the-art imputation methods, SDImpute improves the accuracy of downstream analysis, including clustering, visualization, and differential expression analysis.


Subjects
RNA-Seq/statistics & numerical data , Single-Cell Analysis/statistics & numerical data , Software , Animals , Cluster Analysis , Computational Biology , Computer Simulation , Data Interpretation, Statistical , Data Visualization , Databases, Nucleic Acid/statistics & numerical data , Gene Expression Profiling/statistics & numerical data , Genetic Techniques/statistics & numerical data , Humans , RNA, Messenger/genetics , RNA, Messenger/isolation & purification
2.
PLoS Comput Biol ; 15(8): e1007293, 2019 08.
Article in English | MEDLINE | ID: mdl-31425522

ABSTRACT

Long interspersed nuclear element 1 (LINE-1) is a primary source of genetic variation in humans and other mammals. Despite its importance, LINE-1 activity remains difficult to study because of its highly repetitive nature. Here, we developed and validated a method called TeXP to gauge LINE-1 activity accurately. TeXP builds mappability signatures from LINE-1 subfamilies to deconvolve the effect of pervasive transcription from autonomous LINE-1 activity. In particular, it apportions the multiple reads aligned to the many LINE-1 instances in the genome into these two categories. Using our method, we evaluated well-established cell lines, cell-line compartments and healthy tissues and found that the vast majority (91.7%) of transcriptome reads overlapping LINE-1 derive from pervasive transcription. We validated TeXP by independently estimating the levels of LINE-1 autonomous transcription using ddPCR, finding high concordance. Next, we applied our method to comprehensively measure LINE-1 activity across healthy somatic cells, while backing out the effect of pervasive transcription. Unexpectedly, we found that LINE-1 activity is present in many normal somatic cells. This finding contrasts with earlier studies showing that LINE-1 has limited activity in healthy somatic tissues, except for neuroprogenitor cells. Interestingly, we found that the amount of LINE-1 activity was associated with the amount of cell turnover, with tissues with low cell turnover rates (e.g., the adult central nervous system) showing lower LINE-1 activity. Altogether, our results show how accounting for pervasive transcription is critical to accurately quantify the activity of highly repetitive regions of the human genome.


Subjects
DNA Transposable Elements/genetics , Long Interspersed Nucleotide Elements/genetics , Models, Genetic , Transcription, Genetic , Animals , Cell Line , Computational Biology , Genetic Techniques/statistics & numerical data , Genome, Human , Humans , Sequence Analysis, RNA/statistics & numerical data
3.
Nat Rev Genet ; 11(3): 191-203, 2010 Mar.
Article in English | MEDLINE | ID: mdl-20125086

ABSTRACT

Methylation of cytosine bases in DNA provides a layer of epigenetic control in many eukaryotes that has important implications for normal biology and disease. Therefore, profiling DNA methylation across the genome is vital to understanding the influence of epigenetics. There has been a revolution in DNA methylation analysis technology over the past decade: analyses that previously were restricted to specific loci can now be performed on a genome-scale and entire methylomes can be characterized at single-base-pair resolution. However, there is such a diversity of DNA methylation profiling techniques that it can be challenging to select one. This Review discusses the different approaches and their relative merits and introduces considerations for data analysis.


Subjects
DNA Methylation/genetics , Genetic Techniques , Animals , Chromatin Immunoprecipitation , Computational Biology , CpG Islands , DNA Restriction Enzymes , Epigenesis, Genetic , Gene Expression Profiling , Genetic Techniques/statistics & numerical data , Genomics , Humans , Nucleic Acid Amplification Techniques , Reproducibility of Results , Sequence Analysis, DNA , Sulfites
4.
Curr Opin Organ Transplant ; 21(5): 476-83, 2016 10.
Article in English | MEDLINE | ID: mdl-27517501

ABSTRACT

PURPOSE OF REVIEW: Despite the benefits of islet and pancreas allotransplantation, their widespread use in type 1 diabetes is limited because of the paucity of suitable donors. Porcine xenotransplantation offers an alternative, and advances in genetic modification of pigs have opened up new potential for its clinical use. This review outlines the barriers to successful islet xenotransplantation, and genetic modifications that have been tested to overcome these. RECENT FINDINGS: Islets from pigs lacking α1,3-galactosyltransferase, to prevent hyperacute rejection, are now used as a background strain for further genetic modifications. The instant blood-mediated inflammatory reaction is overcome by expressing complement regulators including CD46, CD55 and CD59. Prevention of immune-mediated rejection mediated by T cells, macrophages and natural killer cells remains a challenge. The use of immunosuppressive antibodies, such as anti-CD154 or anti-CD2, can be protective, and may be useful if they are produced by the islets themselves. SUMMARY: The field of xenotransplantation has benefited enormously from the development of new genetic modification strategies. With the possibility of multiple genetic modifications in the same animal, and a detailed knowledge of the mechanism of xenograft rejection, the challenge now is to develop islets that provide long-term graft survival without systemic immunosuppression.


Subjects
Genetic Techniques/statistics & numerical data , Graft Rejection/immunology , Islets of Langerhans Transplantation/methods , Transplantation, Heterologous/methods , Animals , Humans , Islets of Langerhans Transplantation/immunology , Swine
5.
Biol Lett ; 9(1): 20121029, 2013 Feb 23.
Article in English | MEDLINE | ID: mdl-23221877

ABSTRACT

A meeting on Biodiversity Technologies was held by the Biodiversity Institute, Oxford on the 27-28 of September 2012 at the Department of Zoology, University of Oxford. The symposium brought together 36 speakers from North America, Australia and across Europe, presenting the latest research on emerging technologies in biodiversity science and conservation. Here we present a perspective on the general trends emerging from the symposium.


Subjects
Biodiversity , Conservation of Natural Resources/methods , Acoustics/instrumentation , Cell Phone/instrumentation , Cell Phone/statistics & numerical data , Databases, Factual/statistics & numerical data , England , Genetic Techniques/instrumentation , Genetic Techniques/statistics & numerical data , Genomics/methods
6.
Nat Genet ; 36(9): 943-7, 2004 Sep.
Article in English | MEDLINE | ID: mdl-15340433

ABSTRACT

A sound epistemological foundation for biological inquiry comes, in part, from application of valid statistical procedures. This tenet is widely appreciated by scientists studying the new realm of high-dimensional biology, or 'omic' research, which involves multiplicity at unprecedented scales. Many papers aimed at the high-dimensional biology community describe the development or application of statistical techniques. The validity of many of these is questionable, and a shared understanding about the epistemological foundations of the statistical methods themselves seems to be lacking. Here we offer a framework in which the epistemological foundation of proposed statistical methods can be evaluated.


Subjects
Knowledge , Molecular Biology/statistics & numerical data , Statistics as Topic/methods , Genetic Techniques/statistics & numerical data , Guidelines as Topic , Logic , Reproducibility of Results
7.
PLoS Genet ; 5(5): e1000462, 2009 May.
Article in English | MEDLINE | ID: mdl-19424414

ABSTRACT

The p53 tumor suppressor regulates its target genes through sequence-specific binding to DNA response elements (REs). Although numerous p53 REs are established, the thousands more identified by bioinformatics are not easily subjected to comparative functional evaluation. To examine the relationship between RE sequence variation -- including polymorphisms -- and p53 binding, we have developed a multiplex format microsphere assay of protein-DNA binding (MAPD) for p53 in nuclear extracts. Using MAPD we measured sequence-specific p53 binding of doxorubicin-activated or transiently expressed p53 to REs from established p53 target genes and p53 consensus REs. To assess the sensitivity and scalability of the assay, we tested 16 variants of the p21 target sequence and a 62-multiplex set of single nucleotide (nt) variants of the p53 consensus sequence and found many changes in p53 binding that are not captured by current computational binding models. A group of eight single nucleotide polymorphisms (SNPs) was examined and binding profiles closely matched transactivation capability tested in luciferase constructs. The in vitro binding characteristics of p53 in nuclear extracts recapitulated the cellular in vivo transactivation capabilities for eight well-established human REs measured by luciferase assay. Using a set of 26 bona fide REs, we observed distinct binding patterns characteristic of transiently expressed wild type and mutant p53s. This microsphere assay system utilizes biologically meaningful cell extracts in a multiplexed, quantitative, in vitro format that provides a powerful experimental tool for elucidating the functional impact of sequence polymorphism and protein variation on protein/DNA binding in transcriptional networks.


Subjects
DNA/genetics , DNA/metabolism , Genetic Techniques , Tumor Suppressor Protein p53/metabolism , Base Sequence , Binding Sites/genetics , Cell Nucleus/metabolism , Fluorescent Dyes , Gene Regulatory Networks , Genes, p53 , Genetic Techniques/statistics & numerical data , Humans , In Vitro Techniques , Microspheres , Models, Genetic , Mutagenesis, Site-Directed , Polymorphism, Single Nucleotide , Protein Binding , Recombinant Proteins/chemistry , Recombinant Proteins/genetics , Recombinant Proteins/metabolism , Sensitivity and Specificity , Tumor Suppressor Protein p53/chemistry , Tumor Suppressor Protein p53/genetics
8.
Mol Genet Genomics ; 286(3-4): 279-91, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21879293

ABSTRACT

A new sensitive method for multiplex gene-specific methylation analysis was developed using a ligation-based approach combined with a TaqMan-based detection and readout employing universal reporter probes. The approach, termed methylation-specific Ligation Detection Reaction (msLDR), was applied to test 16 loci in 8 different colorectal cancer cell lines in parallel. These loci encode immune regulatory genes involved in T-cell and natural killer cell activation, whose silencing is associated with the development or progression of colorectal cancer. Parallel analysis of HLA-A, HLA-B, STAT1, B2M, LMP2, LMP7, PA28α, TAP1, TAP2, TAPBP, ULBP2 and ULBP3 by msLDR in eight colorectal cancer cell lines showed preferential methylation at the HLA-B, ULBP2 and ULBP3 loci, but not at the other loci. MsLDR was found to represent a suitable and sensitive method for the detection of distinct methylation patterns as validated by conventional bisulphite Sanger sequencing and COBRA analysis. Since gene silencing by epigenetic mechanisms plays a central role during transformation of a normal differentiated somatic cell into a cancer cell, characterization of the gene methylation status in tumours is a major topic not only in basic research, but also in clinical diagnostics. Due to a very simple workflow, msLDR is likely to be applicable to clinical samples and thus comprises a potential diagnostic tool for clinical purposes.


Subjects
DNA Methylation , Genetic Techniques , Antigen Presentation/genetics , Cell Line, Tumor , Colorectal Neoplasms/chemistry , Colorectal Neoplasms/genetics , Colorectal Neoplasms/immunology , DNA, Neoplasm/chemistry , DNA, Neoplasm/genetics , GPI-Linked Proteins/genetics , Genes, MHC Class I , Genetic Techniques/statistics & numerical data , Humans , Intercellular Signaling Peptides and Proteins/genetics , Killer Cells, Natural/immunology , Miniaturization , Polymerase Chain Reaction/methods , T-Lymphocytes/immunology
9.
Commun Biol ; 4(1): 661, 2021 06 02.
Article in English | MEDLINE | ID: mdl-34079046

ABSTRACT

Detecting changes in the activity of a transcription factor (TF) in response to a perturbation provides insights into the underlying cellular process. Transcription Factor Enrichment Analysis (TFEA) is a robust and reliable computational method that detects positional motif enrichment associated with changes in transcription observed in response to a perturbation. TFEA detects positional motif enrichment within a list of ranked regions of interest (ROIs), typically sites of RNA polymerase initiation inferred from regulatory data such as nascent transcription. Therefore, we also introduce muMerge, a statistically principled method of generating a consensus list of ROIs from multiple replicates and conditions. TFEA is broadly applicable to data that inform on transcriptional regulation, including nascent transcription (e.g., PRO-Seq), CAGE, histone ChIP-Seq, and accessibility data (e.g., ATAC-Seq). TFEA not only identifies the key regulators responding to a perturbation, but also temporally unravels regulatory networks with time series data. Consequently, TFEA serves as a hypothesis-generating tool that provides an easy, rigorous, and cost-effective means to broadly assess TF activity, yielding new biological insights.


Subjects
Transcription Factors/metabolism , Breast/cytology , Breast/metabolism , Cell Line , Chromatin Immunoprecipitation Sequencing/statistics & numerical data , Computational Biology/methods , Computer Simulation , Dexamethasone/pharmacology , Epithelial Cells/metabolism , Female , Gene Expression Regulation , Genetic Techniques/statistics & numerical data , HCT116 Cells , Humans , Imidazoles/pharmacology , Piperazines/pharmacology , Receptors, Glucocorticoid/drug effects , Receptors, Glucocorticoid/metabolism , Transcription Factors/genetics , Transcription, Genetic , Tumor Suppressor Protein p53/genetics , Tumor Suppressor Protein p53/metabolism
10.
Transgenic Res ; 19(1): 57-65, 2010 Feb.
Article in English | MEDLINE | ID: mdl-19533405

ABSTRACT

This paper illustrates the advantages that a fuzzy-based aggregation method could bring into the validation of a multiplex method for GMO detection (DualChip GMO kit, Eppendorf). Guidelines for validation of chemical, bio-chemical, pharmaceutical and genetic methods have been developed and ad hoc validation statistics are available and routinely used, for in-house and inter-laboratory testing, and decision-making. Fuzzy logic allows summarising the information obtained by independent validation statistics into one synthetic indicator of overall method performance. The microarray technology, introduced for simultaneous identification of multiple GMOs, poses specific validation issues (patterns of performance for a variety of GMOs at different concentrations). A fuzzy-based indicator for overall evaluation is illustrated in this paper, and applied to validation data for different genetically modified elements. Remarks were drawn on the analytical results. The fuzzy-logic based rules were shown to be applicable to improve interpretation of results and facilitate overall evaluation of the multiplex method.


Subjects
Fuzzy Logic , Genetic Techniques/statistics & numerical data , Organisms, Genetically Modified/genetics , Validation Studies as Topic , Algorithms , Animals , Data Collection/methods , Data Collection/statistics & numerical data , Data Interpretation, Statistical , Microarray Analysis/methods , Microarray Analysis/statistics & numerical data
11.
BMC Med Res Methodol ; 10: 47, 2010 May 28.
Article in English | MEDLINE | ID: mdl-20509879

ABSTRACT

BACKGROUND: The capacity of multiple comparisons to produce false positive findings in genetic association studies is abundantly clear. To address this issue, the concept of false positive report probability (FPRP) measures "the probability of no true association between a genetic variant and disease given a statistically significant finding". This concept involves the notion of prior probability of an association between a genetic variant and a disease, making it difficult to achieve acceptable levels for the FPRP when the prior probability is low. Increasing the sample size does little to improve the situation. METHODS: To further clarify this problem, the concept of true report probability (TRP) is introduced by analogy to the positive predictive value (PPV) of diagnostic testing. The approach is extended to consider the effects of replication studies. The formula for the TRP after k replication studies is mathematically derived and shown to be only dependent on prior probability, alpha, power, and number of replication studies. RESULTS: Case-control association studies are used to illustrate the TRP concept for replication strategies. Based on power considerations, a relationship is derived between TRP after k replication studies and sample size of each individual study. That relationship enables study designers to optimize study plans. Further, it is demonstrated that replication is efficient in increasing the TRP even in the case of low prior probability of an association and without requiring very large sample sizes for each individual study. CONCLUSIONS: True report probability is a comprehensive and straightforward concept for assessing the validity of positive statistical testing results in association studies. By its extension to replication strategies it can be demonstrated in a transparent manner that replication is highly effective in distinguishing spurious from true associations. Based on the generalized TRP method for replication designs, optimal research strategy and sample size planning become possible.
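
The dependence of the TRP on prior probability, alpha, power and number of replications can be illustrated with a small calculation. The sketch below is not taken from the paper: it assumes, by analogy with the positive predictive value, that all k + 1 studies are independent, share the same alpha and power, and all report a significant result, which gives TRP = prior * power^(k+1) / (prior * power^(k+1) + (1 - prior) * alpha^(k+1)). The function name `true_report_probability` is hypothetical.

```python
def true_report_probability(prior, alpha, power, k=0):
    """Probability that an association is real given k + 1 significant studies.

    Illustrative assumption (not quoted from the paper): the original study
    and its k replications are independent and share the same alpha and power.
    """
    if not 0.0 <= prior <= 1.0:
        raise ValueError("prior must lie in [0, 1]")
    if not (0.0 < alpha < 1.0 and 0.0 < power <= 1.0):
        raise ValueError("alpha must lie in (0, 1) and power in (0, 1]")
    if k < 0:
        raise ValueError("k must be a non-negative number of replications")

    n_studies = k + 1
    # Chance that a true association is significant in every study.
    true_positive = prior * power ** n_studies
    # Chance that a null association is significant in every study.
    false_positive = (1.0 - prior) * alpha ** n_studies
    return true_positive / (true_positive + false_positive)


def false_positive_report_probability(prior, alpha, power, k=0):
    """FPRP is the complement of the TRP under the same assumptions."""
    return 1.0 - true_report_probability(prior, alpha, power, k)
```

Under these assumptions, a prior of 0.001 with alpha 0.05 and power 0.8 gives a TRP of roughly 0.016 with no replication, 0.20 after one replication and 0.80 after two, which is consistent with the abstract's point that replication helps even when the prior is low.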


Subjects
False Positive Reactions , Genetic Techniques/statistics & numerical data , Case-Control Studies , Genetic Predisposition to Disease , Humans , Reproducibility of Results , Sample Size
12.
BMC Health Serv Res ; 9: 131, 2009 Jul 30.
Article in English | MEDLINE | ID: mdl-19643018

ABSTRACT

BACKGROUND: Molecular oncology testing (MOT) to detect genomic alterations underlying cancer holds promise for improved cancer care. Yet knowledge limitations regarding the delivery of testing services may constrain the translation of scientific advancements into effective health care. METHODS: We conducted a cross-sectional, self-administered, postal survey of active cancer physicians in Ontario, Canada (N = 611) likely to order MOT, and cancer laboratories (N = 99) likely to refer (i.e., referring laboratories) or conduct (i.e., testing laboratories) MOT in 2006, to assess respondents' perceptions of the importance and accessibility of MOT and their preparedness to provide it. RESULTS: 54% of physicians, 63% of testing laboratories and 60% of referring laboratories responded. Most perceived MOT to be important for treatment, diagnosis or prognosis now, and in 5 years (61% - 100%). Yet only 45% of physicians, 59% of testing labs and 53% of referring labs agreed that patients in their region were receiving MOT that is indicated as a standard of care. Physicians and laboratories perceived various barriers to providing MOT, including, among 70% of physicians, a lack of clear guidelines regarding clinical indications, and among laboratories, a lack of funding (73% - 100%). Testing laboratories were confident of their ability to determine whether and which MOT was indicated (77% and 82% respectively), and perceived that key elements of formal and continuing education were helpful (75% - 100%). By contrast, minorities of physicians were confident of their ability to assess whether and which MOT was indicated (46% and 34% respectively), and while majorities considered various continuing educational resources helpful (68% - 75%), only minorities considered key elements of formal education helpful in preparing for MOT (17% - 43%). 
CONCLUSION: Physicians and laboratory professionals were enthusiastic about the value of MOT for cancer care but most did not believe patients were gaining adequate access to clinically necessary testing. Further, our results suggest that many were ill equipped as individual stakeholders, or as a coordinated system of referral and interpretation, to provide MOT. These challenges should inspire educational, training and other interventions to ensure that developments in molecular oncology can result in optimal cancer care.


Subjects
Laboratories , Mass Screening/methods , Neoplasms/diagnosis , Neoplasms/genetics , Physicians , Attitude , Cross-Sectional Studies , Female , Genetic Techniques/statistics & numerical data , Genetic Testing , Health Care Surveys , Humans , Male , Mass Screening/statistics & numerical data , Ontario , Pathology, Clinical
13.
BMC Genomics ; 9: 360, 2008 Jul 31.
Article in English | MEDLINE | ID: mdl-18667089

ABSTRACT

BACKGROUND: The difficulty in elucidating the genetic basis of complex diseases is rooted in the many factors that can affect the development of a disease. Some of these genetic effects may interact in complex ways, proving undetectable by current single-locus methodology. RESULTS: We have developed an analysis tool called Hypothesis Free Clinical Cloning (HFCC) to search for genome-wide epistasis in a case-control design. HFCC combines a relatively fast computing algorithm for genome-wide epistasis detection with the flexibility to test a variety of different epistatic models in multi-locus combinations. HFCC has good power to detect multi-locus interactions simulated under a variety of genetic models and noise conditions. Most importantly, HFCC can accomplish exhaustive genome-wide epistasis search with large datasets, as demonstrated with a 400,000 SNP set typed on a cohort of Parkinson's disease patients and controls. CONCLUSION: With the current availability of genetic studies with large numbers of individuals and genetic markers, HFCC can have a great impact in the identification of epistatic effects that escape the standard single-locus association analyses.


Subjects
Epistasis, Genetic , Genetic Techniques , Genome, Human , Genomics/methods , Algorithms , Case-Control Studies , Cohort Studies , Databases, Genetic , Genetic Predisposition to Disease , Genetic Techniques/statistics & numerical data , Genomics/statistics & numerical data , Genotype , Humans , Linkage Disequilibrium , Multivariate Analysis , Parkinson Disease/genetics , Polymorphism, Single Nucleotide , Software
14.
Genetics ; 175(4): 1975-86, 2007 Apr.
Article in English | MEDLINE | ID: mdl-17277369

ABSTRACT

Linkage disequilibrium (LD) analysis in outbred populations uses historical recombinations to detect and fine map quantitative trait loci (QTL). Our objective was to evaluate the effect of various factors on power and precision of QTL detection and to compare LD mapping methods on the basis of regression and identity by descent (IBD) in populations of limited effective population size (N(e)). An 11-cM region with 6-38 segregating single-nucleotide polymorphisms (SNPs) and a central QTL was simulated. After 100 generations of random mating with N(e) of 50, 100, or 200, SNP genotypes and phenotypes were generated on 200, 500, or 1000 individuals with the QTL explaining 2 or 5% of phenotypic variance. To detect and map the QTL, phenotypes were regressed on genotypes or (assumed known) haplotypes, in comparison with the IBD method. Power and precision to detect QTL increased with sample size, marker density, and QTL effect. Power decreased with N(e), but precision was affected little by N(e). Single-marker regression had similar or greater power and precision than other regression models, and was comparable to the IBD method. Thus, for rapid initial screening of samples of adequate size in populations in which drift is the primary force that has created LD, QTL can be detected and mapped by regression on SNP genotypes without recovering haplotypes.
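
Regression of phenotypes on SNP genotypes, as described in the abstract above, can be sketched in a few lines. This is a minimal illustration rather than the authors' simulation or test procedure: genotypes are assumed to be coded as allele counts (0, 1, 2), each marker is fitted by ordinary least squares, and the marker with the largest R² is reported as the most likely QTL position. The function names `single_marker_regression` and `best_marker` are hypothetical.

```python
def single_marker_regression(genotypes, phenotypes):
    """Least-squares fit of phenotype on one SNP coded as 0/1/2 allele counts.

    Returns (slope, r_squared). The slope estimates the allele substitution
    effect; r_squared is the share of phenotypic variance explained.
    """
    if len(genotypes) != len(phenotypes) or len(genotypes) < 3:
        raise ValueError("need matching genotype and phenotype lists of length >= 3")
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((y - mean_y) ** 2 for y in phenotypes)
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, phenotypes))
    if sxx == 0:
        raise ValueError("marker is monomorphic; no regression is possible")
    slope = sxy / sxx
    r_squared = 0.0 if syy == 0 else sxy ** 2 / (sxx * syy)
    return slope, r_squared


def best_marker(marker_genotypes, phenotypes):
    """Index of the SNP explaining the most variance (None if all monomorphic).

    Ranking by R^2 is a stand-in chosen for this sketch; it is not claimed
    to be the test statistic used in the paper.
    """
    best_index, best_r2 = None, -1.0
    for index, genotypes in enumerate(marker_genotypes):
        if len(set(genotypes)) < 2:  # skip non-segregating markers
            continue
        _, r2 = single_marker_regression(genotypes, phenotypes)
        if r2 > best_r2:
            best_index, best_r2 = index, r2
    return best_index
```

No haplotype reconstruction is needed for this kind of scan, which is the practical advantage the abstract highlights for rapid initial screening.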


Subjects
Chromosome Mapping/methods , Linkage Disequilibrium , Quantitative Trait Loci , Chromosome Mapping/statistics & numerical data , Genetic Techniques/statistics & numerical data , Genetics, Population , Haplotypes , Models, Genetic , Phenotype , Regression Analysis
15.
Surg Clin North Am ; 88(4): 705-21, v, 2008 Aug.
Article in English | MEDLINE | ID: mdl-18672137

ABSTRACT

Genetic testing for mutations in genes associated with an inherited predisposition to cancer is rapidly moving outside specialty genetic services and into mainstream health care. Surgeons, as front-line providers of cancer care, are uniquely positioned to identify those who may benefit from genetic testing and institute changes to their health care management based on those results. This article provides an overview of the critical elements of the process of genetic testing for cancer susceptibility.


Subjects
DNA, Neoplasm/analysis , Genetic Predisposition to Disease , Genetic Techniques/statistics & numerical data , Neoplasms/diagnosis , Neoplasms/genetics , Humans , Reproducibility of Results
16.
J Appl Genet ; 49(1): 81-92, 2008.
Article in English | MEDLINE | ID: mdl-18263973

ABSTRACT

We analysed data from a selective DNA pooling experiment with 130 individuals of the arctic fox (Alopex lagopus), which originated from 2 different types regarding body size. The association between alleles of 6 selected unlinked molecular markers and body size was tested by using univariate and multinomial logistic regression models, applying odds ratio and test statistics from the power divergence family. Due to the small sample size and the resulting sparseness of the data table, in hypothesis testing we could not rely on the asymptotic distributions of the tests. Instead, we tried to account for data sparseness by (i) modifying confidence intervals of odds ratio; (ii) using a normal approximation of the asymptotic distribution of the power divergence tests with different approaches for calculating moments of the statistics; and (iii) assessing P values empirically, based on bootstrap samples. As a result, a significant association was observed for 3 markers. Furthermore, we used simulations to assess the validity of the normal approximation of the asymptotic distribution of the test statistics under the conditions of small and sparse samples.


Subjects
Biometry , Foxes/genetics , Genetic Techniques/statistics & numerical data , Alleles , Animals , Body Size/genetics , Chromosome Mapping/statistics & numerical data , Confidence Intervals , Genetic Markers/genetics , Logistic Models , Models, Genetic , Odds Ratio , Sample Size
18.
PLoS One ; 13(11): e0206521, 2018.
Article in English | MEDLINE | ID: mdl-30395579

ABSTRACT

BACKGROUND: The massive quantities of genetic data generated by high-throughput sequencing pose challenges to data storage, transmission and analyses. These problems are effectively solved through data compression, in which the size of data storage is reduced and the speed of data transmission is improved. Several options are available for compressing and storing genetic data. However, most of these options either do not provide sufficient compression rates or require a considerable length of time for decompression and loading. RESULTS: Here, we propose TRCMGene, a lossless genetic data compression method that uses a referential compression scheme. TRCMGene introduces a novel two-step compression method, which builds an index structure using K-means and k-nearest neighbours. Evaluation with several real datasets revealed that the compression factor of TRCMGene ranges from 9 to 21. TRCMGene presents a good balance between compression factor and reading time. On average, the reading time of compressed data is 60% of that of uncompressed data. Thus, TRCMGene not only saves disc space but also saves file access time and speeds up data loading. These effects collectively improve genetic data storage and transmission in the current hardware environment and render system upgrades unnecessary. TRCMGene, the user manual and demos can be accessed freely from https://github.com/tangyou79/TRCM. The data mentioned in this manuscript can be downloaded from: https://github.com/tangyou79/TRCM/wiki.


Subjects
Data Compression/methods , Genetic Techniques/statistics & numerical data , Software , Algorithms , Animals , Arabidopsis/genetics , Cluster Analysis , Data Compression/statistics & numerical data , Databases, Genetic/statistics & numerical data , High-Throughput Nucleotide Sequencing/statistics & numerical data , Mice , Sequence Analysis, DNA/statistics & numerical data , Zea mays/genetics
19.
BMC Genomics ; 8: 106, 2007 Apr 19.
Article in English | MEDLINE | ID: mdl-17445269

ABSTRACT

BACKGROUND: In the few years since its discovery, RNAi has turned into a very powerful tool for the study of gene function by allowing post-transcriptional gene silencing. The RNAi mechanism, which is based on the introduction of a double-stranded RNA (dsRNA) trigger whose sequence is similar to that of the targeted messenger RNA (mRNA), is subject to off-target cross-reaction. RESULTS: We use a novel strategy based on phenotypic analysis of paralogs and predict that, in Caenorhabditis elegans, off-target effects occur when an mRNA sequence shares more than 95% identity over 40 nucleotides with the dsRNA. Interestingly, our results suggest that the minimum length of a high-similarity stretch between a dsRNA and its target needed to observe an efficient RNAi effect varies from 30 to 50 nucleotides rather than 22 nucleotides, which is the length of siRNAs in C. elegans. CONCLUSION: Our predictive methods would improve the design of dsRNA and ultimately the use of RNAi as a therapeutic tool upon experimental verification.
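
The off-target criterion stated in the abstract (more than 95% identity over 40 nucleotides) lends itself to a simple sliding-window check. The sketch below is an illustration only, not the authors' predictive method: it compares ungapped windows of a single strand, ignores the reverse complement, and uses the hypothetical function name `has_off_target_window`. The identity threshold is passed as an integer percentage so that the strict "more than 95%" comparison is done in exact integer arithmetic.

```python
def has_off_target_window(dsrna, mrna, window=40, min_identity_pct=95):
    """Return True if any ungapped `window`-nt stretch of `dsrna` matches a
    stretch of `mrna` with identity strictly above `min_identity_pct` percent.

    Simplifying assumptions for this sketch: no gaps, one strand only,
    and DNA-style input (T) is treated as RNA (U).
    """
    dsrna = dsrna.upper().replace("T", "U")
    mrna = mrna.upper().replace("T", "U")
    if len(dsrna) < window or len(mrna) < window:
        return False

    for i in range(len(dsrna) - window + 1):
        ds_window = dsrna[i:i + window]
        for j in range(len(mrna) - window + 1):
            # Count identical positions in the two aligned windows.
            matches = sum(a == b for a, b in zip(ds_window, mrna[j:j + window]))
            # Integer comparison avoids floating-point edge cases at exactly 95%.
            if matches * 100 > min_identity_pct * window:
                return True
    return False
```

With the default parameters, a 40-nucleotide window must contain at least 39 matching positions to be flagged, since 38 of 40 is exactly 95% and the criterion is strict. The brute-force double loop is quadratic in sequence length, so a real screen against a whole transcriptome would need an indexed search instead.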


Subjects
Base Pair Mismatch/genetics , Computational Biology , RNA Interference , RNA, Small Interfering/chemistry , Animals , Caenorhabditis elegans/genetics , Caenorhabditis elegans Proteins/physiology , Genetic Techniques/statistics & numerical data , Models, Chemical , RNA Interference/drug effects , RNA, Double-Stranded/chemistry , RNA, Double-Stranded/metabolism , RNA, Small Interfering/genetics , Ribonuclease III , Sensitivity and Specificity
20.
Stat Appl Genet Mol Biol ; 5: Article28, 2006.
Article in English | MEDLINE | ID: mdl-17402912

ABSTRACT

For situations where the number of tested hypotheses is increasingly large, the power to detect statistically significant multiple treatment effects decreases. As is the case with microarray technology, often researchers are interested in identifying differentially expressed genes for more than two types of cells or treatments. A two-step procedure is proposed for the purpose of increasing power to detect significant effects (i.e., to identify differentially expressed genes). Specifically, in the first step, the null hypothesis of equality across the mean expression levels for all treatments is tested for each gene. In the second step, only pairwise comparisons corresponding to the genes for which the treatment means are statistically different in the first step are tested. We propose an approach to estimate the overall FDR for both fixed rejection regions and fixed FDR significance levels. Also proposed is a procedure to find the FDR significance levels used in the first step and the second step such that the overall FDR can be controlled below a pre-specified FDR significance level. When compared via simulation, the two-step approach has increased power over a one-step procedure and controls the FDR at a desired significance level.
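
The two-step structure described above can be sketched as follows. This is only an illustration of the screening idea: it uses the Benjamini-Hochberg step-up rule as a stand-in for each step, and it does not reproduce the paper's procedure for estimating the overall FDR or for choosing the two step-wise levels. The function names and the arguments `q1` and `q2` are hypothetical.

```python
def benjamini_hochberg(pvalues, q):
    """Benjamini-Hochberg step-up rule; returns one rejection flag per p-value."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value falls under its BH threshold.
        if pvalues[idx] <= q * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


def two_step_screen(overall_p, pairwise_p, q1, q2):
    """Step 1: test equality of all treatment means per gene at level q1.
    Step 2: test pairwise comparisons at level q2, only for genes kept in step 1.

    overall_p:  list with one overall p-value per gene.
    pairwise_p: list of lists; pairwise_p[g] holds gene g's pairwise p-values.
    Returns {gene index: [rejection flag per pairwise comparison]}.
    """
    passed = benjamini_hochberg(overall_p, q1)
    kept = [g for g, ok in enumerate(passed) if ok]

    # Pool the pairwise tests of the surviving genes into one family.
    pooled = [(g, j, p) for g in kept for j, p in enumerate(pairwise_p[g])]
    flags = benjamini_hochberg([p for _, _, p in pooled], q2)

    result = {g: [False] * len(pairwise_p[g]) for g in kept}
    for (g, j, _), flag in zip(pooled, flags):
        result[g][j] = flag
    return result
```

Because genes filtered out in the first step contribute no pairwise tests, the second step faces a smaller multiplicity burden, which is the source of the power gain the abstract reports.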


Subjects
Genetic Techniques/statistics & numerical data , Matched-Pair Analysis , Algorithms , Breast Neoplasms/genetics , Computer Simulation , Female , Gene Expression Profiling/statistics & numerical data , Gene Expression Regulation/drug effects , Genes, BRCA1 , Genes, BRCA2 , Humans , Models, Genetic , Mutation , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Probability