Results 1 - 20 of 34
1.
Bioinformatics ; 37(15): 2081-2087, 2021 Aug 09.
Article in English | MEDLINE | ID: mdl-33515232

ABSTRACT

MOTIVATION: Unique marker sequences are highly sought after in molecular diagnostics. Nevertheless, there are only a few programs available to search for marker sequences, compared to the many programs for similarity search. We therefore wrote the program Fur for Finding Unique genomic Regions. RESULTS: Fur takes as input a sample of target sequences and a sample of closely related neighbors. It returns the regions present in all targets and absent from all neighbors. The recently published program genmap can also be used for this purpose and we compared it to fur. When analyzing a sample of 33 genomes representing the major phylogroups of E. coli, fur was 40 times faster than genmap but used three times more memory. On the other hand, genmap yielded three times more markers, but they were less accurate when tested in silico on a sample of 237 E. coli genomes. We also designed phylogroup-specific PCR primers based on the markers proposed by genmap and fur, and tested them by analyzing their virtual amplicons in GenBank. Finally, we used fur to design primers specific to a Lactobacillus species, and found excellent sensitivity and specificity in vitro. AVAILABILITY AND IMPLEMENTATION: Fur sources and documentation are available from https://github.com/evolbioinf/fur. The compiled software is posted as a docker container at https://hub.docker.com/r/haubold/fox. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
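
The idea of "regions present in all targets and absent from all neighbors" can be illustrated with a toy k-mer filter. This is only a sketch under stated assumptions, not Fur's algorithm, which the abstract does not describe: the function names, the exact-k-mer criterion, and the default k are all hypothetical choices made for illustration.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def unique_regions(targets, neighbors, k=8):
    """Toy marker search (hypothetical helper, not part of Fur).

    Returns (start, end) intervals on targets[0], end exclusive, that are
    covered by k-mers found in every target and in no neighbor.
    """
    # k-mers shared by all targets
    shared = set.intersection(*(kmers(t, k) for t in targets))
    # subtract everything seen in any neighbor
    for neighbor in neighbors:
        shared -= kmers(neighbor, k)
    ref = targets[0]
    regions = []
    for i in range(len(ref) - k + 1):
        if ref[i:i + k] in shared:
            if regions and i <= regions[-1][1]:
                regions[-1][1] = i + k      # overlaps or abuts the last region
            else:
                regions.append([i, i + k])  # start a new region
    return [tuple(r) for r in regions]
```

For example, `unique_regions(["AAACGTCCC", "GGACGTGGG"], ["AAATTTCCC"], k=3)` reports the interval covering `ACGT` on the first target. Exact k-mer matching ignores mutations within the targets, so a real tool needs a more tolerant notion of "present in all targets".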

2.
Bioinformatics ; 36(7): 2040-2046, 2020 Apr 01.
Article in English | MEDLINE | ID: mdl-31790149

ABSTRACT

MOTIVATION: Tracking disease outbreaks by whole-genome sequencing leads to the collection of large samples of closely related sequences. Five years ago, we published a method to accurately compute all pairwise distances for such samples by indexing each sequence. Since indexing is slow, we now ask whether it is possible to achieve similar accuracy when indexing only a single sequence. RESULTS: We have implemented this idea in the program phylonium and show that it is as accurate as its predecessor and roughly 100 times faster when applied to all 2678 Escherichia coli genomes contained in ENSEMBL. One of the best published programs for rapidly computing pairwise distances, mash, analyzes the same dataset four times faster but, with default settings, it is less accurate than phylonium. AVAILABILITY AND IMPLEMENTATION: Phylonium runs under the UNIX command line; its C++ sources and documentation are available from github.com/evolbioinf/phylonium. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genomics, Software, Algorithms, Genome, DNA Sequence Analysis
3.
Bioinformatics ; 35(11): 1813-1819, 2019 Jun 01.
Article in English | MEDLINE | ID: mdl-30395202

ABSTRACT

MOTIVATION: Unique sequence regions are associated with genetic function in vertebrate genomes. However, measuring uniqueness, or absence of long repeats, along a genome is conceptually and computationally difficult. Here we use a variant of the Lempel-Ziv complexity, the match complexity, Cm, and augment it by deriving its null distribution for random sequences. We then apply Cm to the human and mouse genomes to investigate the relationship between sequence complexity and function. RESULTS: We implemented Cm in the program macle and show through simulation that the newly derived null distribution of Cm is accurate. This allows us to delineate high-complexity regions in the human and mouse genomes. Using our program macle2go, we find that these regions are twofold enriched for genes. Moreover, the genes contained in these regions are more than 10-fold enriched for developmental functions. AVAILABILITY AND IMPLEMENTATION: Source code for macle and macle2go is available from www.github.com/evolbioinf/macle and www.github.com/evolbioinf/macle2go, respectively; Cm browser tracks from guanine.evolbio.mgp.de/complexity. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genome, Genomics, Animals, Developmental Genes, Humans, Mammals, Mice, Software
4.
Bioinformatics ; 32(16): 2554-5, 2016 Aug 15.
Article in English | MEDLINE | ID: mdl-27153632

ABSTRACT

MOTIVATION: In many organisms, including humans, recombination clusters within recombination hotspots. The standard method for de novo detection of recombinants at hotspots is sperm typing. This relies on allele-specific PCR at single nucleotide polymorphisms. Designing allele-specific primers by hand is time-consuming. We have therefore written a package to support hotspot detection and analysis. RESULTS: hotspot consists of four programs: asp looks up SNPs and designs allele-specific primers; aso constructs allele-specific oligos for mapping recombinants; xov implements a maximum-likelihood method for estimating the crossover rate; six, finally, simulates typing data. AVAILABILITY AND IMPLEMENTATION: hotspot is written in C. Sources are freely available under the GNU General Public License from http://github.com/evolbioinf/hotspot/ CONTACT: haubold@evolbio.mpg.de SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genetic Recombination, Software, Spermatozoa, Alleles, Humans, Likelihood Functions, Male
6.
Brief Bioinform ; 15(3): 407-18, 2014 May.
Article in English | MEDLINE | ID: mdl-24291823

ABSTRACT

Phylogenetics and population genetics are central disciplines in evolutionary biology. Both are based on comparative data, today usually DNA sequences. These have become so plentiful that alignment-free sequence comparison is of growing importance in the race between scientists and sequencing machines. In phylogenetics, efficient distance computation is the major contribution of alignment-free methods. A distance measure should reflect the number of substitutions per site, which underlies classical alignment-based phylogeny reconstruction. Alignment-free distance measures are either based on word counts or on match lengths, and I apply examples of both approaches to simulated and real data to assess their accuracy and efficiency. While phylogeny reconstruction is based on the number of substitutions, in population genetics, the distribution of mutations along a sequence is also considered. This distribution can be explored by match lengths, thus opening the prospect of alignment-free population genomics.


Subject(s)
Population Genetics/methods, Phylogeny, DNA Sequence Analysis/methods, Animals, Computational Biology/methods, Molecular Evolution, Population Genetics/statistics & numerical data, Mitochondrial Genome, Humans, Genetic Models, Mutation, Genetic Recombination, Genetic Selection, Sequence Alignment, DNA Sequence Analysis/statistics & numerical data
7.
Bioinformatics ; 31(8): 1169-75, 2015 Apr 15.
Article in English | MEDLINE | ID: mdl-25504847

ABSTRACT

MOTIVATION: A standard approach to classifying sets of genomes is to calculate their pairwise distances. This is difficult for large samples. We have therefore developed an algorithm for rapidly computing the evolutionary distances between closely related genomes. RESULTS: Our distance measure is based on ungapped local alignments that we anchor through pairs of maximal unique matches of a minimum length. These exact matches can be looked up efficiently using enhanced suffix arrays and our implementation requires approximately only 1 s and 45 MB RAM/Mbase analysed. The pairing of matches distinguishes non-homologous from homologous regions leading to accurate distance estimation. We show this by analysing simulated data and genome samples ranging from 29 Escherichia coli/Shigella genomes to 3085 genomes of Streptococcus pneumoniae. AVAILABILITY AND IMPLEMENTATION: We have implemented the computation of anchor distances in the multithreaded UNIX command-line program andi for ANchor DIstances. C sources and documentation are posted at http://github.com/evolbioinf/andi/ CONTACT: haubold@evolbio.mpg.de SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms, Biological Evolution, Genome, Genomics/methods, Sequence Alignment/methods, DNA Sequence Analysis/methods, Software, Animals, Genetic Databases, Humans, Phylogeny
8.
PLoS Pathog ; 9(7): e1003503, 2013.
Article in English | MEDLINE | ID: mdl-23935484

ABSTRACT

The origins of crop diseases are linked to domestication of plants. Most crops were domesticated centuries, even millennia, ago, thus limiting opportunity to understand the concomitant emergence of disease. Kiwifruit (Actinidia spp.) is an exception: domestication began in the 1930s with outbreaks of canker disease caused by P. syringae pv. actinidiae (Psa) first recorded in the 1980s. Based on SNP analyses of two circularized and 34 draft genomes, we show that Psa is comprised of distinct clades exhibiting negligible within-clade diversity, consistent with disease arising by independent samplings from a source population. Three clades correspond to their geographical source of isolation; a fourth, encompassing the Psa-V lineage responsible for the 2008 outbreak, is now globally distributed. Psa has an overall clonal population structure; however, genomes carry a marked signature of within-pathovar recombination. SNP analysis of Psa-V reveals hundreds of polymorphisms; however, most reside within PPHGI-1-like conjugative elements whose evolution is unlinked to the core genome. Removal of SNPs due to recombination yields an uninformative (star-like) phylogeny consistent with diversification of Psa-V from a single clone within the last ten years. Growth assays provide evidence of cultivar specificity, with rapid systemic movement of Psa-V in Actinidia chinensis. Genomic comparisons show a dynamic genome with evidence of positive selection on type III effectors and other candidate virulence genes. Each clade has highly varied complements of accessory genes encoding effectors and toxins with evidence of gain and loss via multiple genetic routes. Genes with orthologs in vascular pathogens were found exclusively within Psa-V. Our analyses capture a pathogen in the early stages of emergence from a predicted source population associated with wild Actinidia species. In addition to candidate genes as targets for resistance breeding programs, our findings highlight the importance of the source population as a reservoir of new disease.


Subject(s)
Actinidia/microbiology, Bacterial Proteins/genetics, Bacterial Genome, Plant Diseases/microbiology, Pseudomonas syringae/genetics, Actinidia/growth & development, Bacterial Proteins/chemistry, Bacterial Proteins/metabolism, Agricultural Crops/growth & development, Agricultural Crops/microbiology, Fruit/growth & development, Fruit/microbiology, Genomic Islands, Italy, Japan, New Zealand, Phylogeny, Plant Diseases/etiology, Plant Shoots/growth & development, Plant Shoots/microbiology, Single Nucleotide Polymorphism, Pseudomonas syringae/growth & development, Pseudomonas syringae/isolation & purification, Pseudomonas syringae/pathogenicity, Genetic Recombination, Republic of Korea, Species Specificity, Virulence
9.
Bioinformatics ; 29(24): 3121-7, 2013 Dec 15.
Article in English | MEDLINE | ID: mdl-24064419

ABSTRACT

MOTIVATION: "Why recombination?" is one of the central questions in biology. This has led to a host of methods for quantifying recombination from sequence data. These methods are usually based on aligned DNA sequences. Here, we propose an efficient alignment-free alternative. RESULTS: Our method is based on the distribution of match lengths, which we look up using enhanced suffix arrays. By eliminating the alignment step, the test becomes fast enough for application to whole bacterial genomes. Using simulations we show that our test has similar power to established tests when applied to long pairs of sequences. When applied to 58 genomes of Escherichia coli, we pick up the strongest recombination signal from a 125 kb horizontal gene transfer engineered 20 years ago. AVAILABILITY AND IMPLEMENTATION: We have implemented our method in the command-line program rush. Its C sources and documentation are available under the GNU General Public License from http://guanine.evolbio.mpg.de/rush/.


Subject(s)
Algorithms, Computational Biology, Bacterial Genome, Genetic Recombination, Sequence Alignment/methods, Computer Simulation, Escherichia coli/genetics, Phylogeny
10.
Bioinform Adv ; 4(1): vbae113, 2024.
Article in English | MEDLINE | ID: mdl-39132289

ABSTRACT

Motivation: Markers for diagnostic polymerase chain reactions are routinely constructed by taking regions common to the genomes of a target organism and subtracting the regions found in the targets' closest relatives, their neighbors. This approach is implemented in the published package Fur, which originally required memory proportional to the number of nucleotides in the neighborhood. This does not scale well. Results: Here, we describe a new version of Fur that only requires memory proportional to the longest neighbor. In spite of its greater memory efficiency, the new Fur remains fast and is accurate. We demonstrate this by applying it to simulated sequences and comparing it to an efficient alternative. Then we use the new Fur to extract markers from 120 reference bacteria. To make this feasible, we also introduce software for automatically finding target and neighbor genomes and for assessing markers. We pick the best primers from the 10 most sequenced reference bacteria and show their excellent in silico sensitivity and specificity. Availability and implementation: Fur is available from github.com/evolbioinf/fur, in the Docker image hub.docker.com/r/beatrizvm/mapro, and in the Code Ocean capsule 10.24433/CO.7955947.v1.

11.
Bioinformatics ; 27(11): 1466-72, 2011 Jun 01.
Article in English | MEDLINE | ID: mdl-21471011

ABSTRACT

MOTIVATION: Bacterial and viral genomes are often affected by horizontal gene transfer observable as abrupt switching in local homology. In addition to the resulting mosaic genome structure, they frequently contain regions not found in close relatives, which may play a role in virulence mechanisms. Due to this connection to medical microbiology, there are numerous methods available to detect horizontal gene transfer. However, these are usually aimed at individual genes and viral genomes rather than the much larger bacterial genomes. Here, we propose an efficient alignment-free approach to describe the mosaic structure of viral and bacterial genomes, including their unique regions. RESULTS: Our method is based on the lengths of exact matches between pairs of sequences. Long matches indicate close homology, short matches more distant homology or none at all. These exact match lengths can be looked up efficiently using an enhanced suffix array. Our program implementing this approach, alfy (ALignment-Free local homologY), efficiently and accurately detects the recombination break points in simulated DNA sequences and among recombinant HIV-1 strains. We also apply alfy to Escherichia coli genomes where we detect new evidence for the hypothesis that strains pathogenic in poultry can infect humans. AVAILABILITY: alfy is written in standard C and its source code is available under the GNU General Public License from http://guanine.evolbio.mpg.de/alfy/. The software package also includes documentation and example data.


Subject(s)
Bacterial Genome, Viral Genome, DNA Sequence Analysis, Nucleic Acid Sequence Homology, Escherichia coli/genetics, Horizontal Gene Transfer, Genomics/methods, HIV-1/genetics, Humans, Software
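
The exact-match lengths that the alfy abstract relies on can be sketched naively as follows. This is an illustration only: it assumes plain substring search instead of the enhanced suffix array named in the abstract, and the function name `match_lengths` is hypothetical.

```python
def match_lengths(query, subject):
    """For each position i of query, return the length of the longest
    prefix of query[i:] that occurs somewhere in subject.

    Naive substring search; an enhanced suffix array (as in the abstract)
    would answer the same question far more efficiently.
    """
    lengths = []
    for i in range(len(query)):
        length = 0
        # extend the match while the longer prefix still occurs in subject
        while i + length < len(query) and query[i:i + length + 1] in subject:
            length += 1
        lengths.append(length)
    return lengths
```

Runs of long matches along the query point to close homology with the subject, and an abrupt drop in match length is the kind of signal the abstract associates with a switch in local homology.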
12.
Bioinformatics ; 27(4): 449-55, 2011 Feb 15.
Article in English | MEDLINE | ID: mdl-21156730

ABSTRACT

MOTIVATION: Sequencing capacity is currently growing more rapidly than CPU speed, leading to an analysis bottleneck in many genome projects. Alignment-free sequence analysis methods tend to be more efficient than their alignment-based counterparts. They may, therefore, be important in the long run for keeping sequence analysis abreast with sequencing. RESULTS: We derive and implement an alignment-free estimator of the number of pairwise mismatches, π_m. Our implementation of π_m, pim, is based on an enhanced suffix array and inherits the superior time and memory efficiency of this data structure. Simulations demonstrate that π_m is accurate if mutations are distributed randomly along the chromosome. While real data often deviate from this ideal, π_m remains useful for identifying regions of low genetic diversity using a sliding window approach. We demonstrate this by applying it to the complete genomes of 37 strains of Drosophila melanogaster, and to the genomes of two closely related Drosophila species, D. simulans and D. sechellia. In both cases, we detect the diversity minimum and discuss its biological implications.


Subject(s)
Computational Biology/methods, Genetic Variation, DNA Sequence Analysis/methods, Algorithms, Animals, Computer Simulation, Drosophila/genetics, Insect Genome, Genetic Recombination
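
For orientation, the quantity being estimated in the abstract above, the number of pairwise mismatches, can be computed directly from aligned sequences and scanned in sliding windows. The sketch below is the classical alignment-based calculation, not the alignment-free estimator implemented in pim; it assumes equal-length, gap-free sequences, and both function names are hypothetical.

```python
from itertools import combinations


def pairwise_mismatches(seqs):
    """Mean number of mismatches per site over all pairs of sequences.

    Assumes at least two aligned, equal-length, gap-free sequences.
    """
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / (len(pairs) * length)


def sliding_diversity(seqs, window, step):
    """Return (start, diversity) for each full window along the alignment."""
    length = len(seqs[0])
    return [
        (start, pairwise_mismatches([s[start:start + window] for s in seqs]))
        for start in range(0, length - window + 1, step)
    ]
```

A window whose value falls well below its neighbors would be a candidate region of low genetic diversity in the sense used by the abstract.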
13.
Methods Mol Biol ; 2242: 77-89, 2021.
Article in English | MEDLINE | ID: mdl-33961219

ABSTRACT

By tracking pathogen outbreaks using whole genome sequencing, medical microbiology is currently being transformed into genomic epidemiology. This change in technology is leading to the rapid accumulation of large samples of closely related genome sequences. Summarizing such samples into phylogenies can be computationally challenging. Our program andi quickly computes accurate pairwise distances between up to thousands of bacterial genomes. Working under the UNIX command line, we show how andi can be used to transform genomes to phylogenies with support values ready to be printed or integrated into documents.


Subject(s)
Bacterial DNA/genetics, Escherichia coli/genetics, Bacterial Genome, Genomics, Phylogeny, Shigella/genetics, Genetic Databases, Research Design, Software Design, Workflow
14.
Oncotarget ; 12(10): 1011-1023, 2021 May 11.
Article in English | MEDLINE | ID: mdl-34012513

ABSTRACT

Non-invasive clinical diagnostics of bladder cancer is feasible via a set of chemically distinct molecules including macromolecular tumor markers such as polypeptides and nucleic acids. In terms of tumor-related aberrant gene expression, RNA transcripts are the primary indicator of tumor-specific gene expression, whereas polypeptides and their metabolic products occur subsequently. Thus, in the case of bladder cancer, urine RNA represents an early, potentially useful diagnostic marker. Here we describe a systematic deep transcriptome analysis of representative pools of urine RNA collected from healthy donors versus bladder cancer patients according to established SOPs. This analysis revealed RNA marker candidates reflecting coding sequences, non-coding sequences, and circular RNAs. Next, we designed and validated PCR amplicons for a set of novel marker candidates and tested them in human bladder cancer cell lines. We identified linear and circular transcripts of the S100 Calcium Binding Protein 6 (S100A6) and translocation associated membrane protein 1 (TRAM1) as highly promising potential tumor markers. This work strongly suggests exploiting urine RNAs as diagnostic markers of bladder cancer and it suggests specific novel markers. Further, this study describes an entry into the tumor-biology of bladder cancer and the development of gene-targeted therapeutic drugs.

15.
Bioinformatics ; 25(24): 3221-7, 2009 Dec 15.
Article in English | MEDLINE | ID: mdl-19825795

ABSTRACT

MOTIVATION: Genome comparison is central to contemporary genomics and typically relies on sequence alignment. However, genome-wide alignments are difficult to compute. We have, therefore, recently developed an accurate alignment-free estimator of the number of substitutions per site based on the lengths of exact matches between pairs of sequences. The previous implementation of this measure requires n(n-1) suffix tree constructions and traversals, where n is the number of sequences analyzed. This does not scale well for large n. RESULTS: We present an algorithm to extract pairwise distances in a single traversal of a single suffix tree containing n sequences. As a result, the run time of the suffix tree construction phase of our algorithm is reduced from O(n^2 L) to O(nL), where L is the length of each sequence. We implement this algorithm in the program kr version 2 and apply it to 825 HIV genomes, 13 genomes of enterobacteria and the complete genomes of 12 Drosophila species. We show that, depending on the input dataset, the new program is at least 10 times faster than its predecessor. AVAILABILITY: Version 2 of kr can be tested via a web interface at http://guanine.evolbio.mpg.de/kr2/. It is written in standard C and its source code is available under the GNU General Public License from the same web site. CONTACT: haubold@evolbio.mpg.de SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms, Genome, Genomics/methods, Animals, Base Sequence, Genetic Databases, Humans, Sequence Alignment, DNA Sequence Analysis
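
The scaling argument in the kr abstract above can be made concrete with a small count. The sketch assumes, consistent with the stated O(n^2 L) bound, that each of the n(n-1) pairwise constructions costs work proportional to L; the function name and the sequence length of 10,000 are hypothetical, and the printed ratio describes only this idealized construction count, not measured run time.

```python
def tree_construction_work(n, seq_len):
    """Return idealized construction work as (pairwise, single).

    pairwise: n(n-1) suffix trees, each assumed to cost seq_len units.
    single:   one suffix tree over all n sequences, n * seq_len units.
    """
    pairwise = n * (n - 1) * seq_len
    single = n * seq_len
    return pairwise, single


# Sample sizes taken from the abstract; seq_len is an arbitrary placeholder.
for n in (825, 13, 12):
    pairwise, single = tree_construction_work(n, seq_len=10_000)
    print(n, pairwise, single, pairwise // single)
```

Under these assumptions the ratio is always n - 1, which is why the gain grows with sample size.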
16.
Genetics ; 182(1): 205-16, 2009 May.
Article in English | MEDLINE | ID: mdl-19237689

ABSTRACT

Using coalescent simulations, we study the impact of three different sampling schemes on patterns of neutral diversity in structured populations. Specifically, we are interested in two summary statistics based on the site frequency spectrum as a function of migration rate, demographic history of the entire substructured population (including timing and magnitude of specieswide expansions), and the sampling scheme. Using simulations implementing both finite-island and two-dimensional stepping-stone spatial structure, we demonstrate strong effects of the sampling scheme on Tajima's D (D(T)) and Fu and Li's D (D(FL)) statistics, particularly under specieswide (range) expansions. Pooled samples yield average D(T) and D(FL) values that are generally intermediate between those of local and scattered samples. Local samples (and to a lesser extent, pooled samples) are influenced by local, rapid coalescence events in the underlying coalescent process. These processes result in lower proportions of external branch lengths and hence lower proportions of singletons, explaining our finding that the sampling scheme affects D(FL) more than it does D(T). Under specieswide expansion scenarios, these effects of spatial sampling may persist up to very high levels of gene flow (Nm > 25), implying that local samples cannot be regarded as being drawn from a panmictic population. Importantly, many data sets on humans, Drosophila, and plants contain signatures of specieswide expansions and effects of sampling scheme that are predicted by our simulation results. This suggests that validating the assumption of panmixia is crucial if robust demographic inferences are to be made from local or pooled samples. However, future studies should consider adopting a framework that explicitly accounts for the genealogical effects of population subdivision and empirical sampling schemes.


Subject(s)
Drosophila melanogaster, Genetic Variation, Population Genetics, Linkage Disequilibrium, Genetic Models, Solanum lycopersicum, Animals, Computer Simulation, Demography, Drosophila melanogaster/genetics, Solanum lycopersicum/genetics, Humans
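
Tajima's D, one of the two summary statistics discussed in the abstract above, can be computed from a sample of aligned sequences with the standard formula. The sketch below assumes equal-length sequences without missing data and treats every variable column as one segregating site; the function name is hypothetical, and the coalescent simulations and sampling schemes of the study are not reproduced.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for aligned, equal-length sequences.

    Returns None when there are no segregating sites (D is undefined).
    Assumes at least two sequences.
    """
    n = len(seqs)
    # number of segregating sites: columns with more than one state
    seg_sites = sum(len(set(col)) > 1 for col in zip(*seqs))
    if seg_sites == 0:
        return None
    # mean number of pairwise differences
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # constants from Tajima's formula
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)
```

An excess of singletons, as expected after an expansion, pushes D below zero, whereas variants at intermediate frequency push it above zero.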
17.
Mol Ecol ; 19 Suppl 1: 277-84, 2010 Mar.
Article in English | MEDLINE | ID: mdl-20331786

ABSTRACT

Improvements in sequencing technology over the past 5 years are leading to routine application of shotgun sequencing in the fields of ecology and evolution. However, the theory to estimate evolutionary parameters from these data is still being worked out. Here we present an extension and implementation of part of this theory, mlRho. This program can efficiently compute the following three maximum likelihood estimators based on shotgun sequence data obtained from single diploid individuals: the population mutation rate (4N_e μ), the sequencing error rate, and the population recombination rate (4N_e c). We demonstrate the accuracy of mlRho by applying it to simulated data sets. In addition, we analyse the genomes of the sea squirt Ciona intestinalis and the water flea Daphnia pulex. Ciona intestinalis is an obligate outcrosser, while D. pulex is a cyclic parthenogen, and we discuss how these contrasting life histories are reflected in our parameter estimates. The program mlRho is freely available from http://guanine.evolbio.mpg.de/mlRho.


Subject(s)
DNA Mutational Analysis/methods, Population Genetics, Genomics/methods, Genetic Recombination, Software, Animals, Ciona intestinalis/genetics, Computational Biology/methods, Computer Simulation, Daphnia/genetics, Diploidy, Genome, Likelihood Functions, Genetic Models
18.
Mol Ecol ; 19 Suppl 1: 162-75, 2010 Mar.
Article in English | MEDLINE | ID: mdl-20331778

ABSTRACT

Recent advances in sequencing technology promise to provide new strategies for studying population differentiation and speciation phenomena in their earliest phases. We focus here on the black carrion crow (Corvus [corone] corone), which forms a zone of hybridization and overlap with the grey coated hooded crow (Corvus [corone] cornix). However, although these semispecies are taxonomically distinct, previous analyses based on several types of genetic markers did not reveal significant molecular differentiation between them. We here corroborate this result with sequence data obtained from a set of 25 nuclear intronic loci. Thus, the system represents a case of a very early phase of species divergence that requires new molecular approaches for its description. We have therefore generated RNAseq expression profiles using barcoded massively parallel pyrosequencing of brain mRNA from six individuals of the carrion crow and five individuals from a hybrid zone with the hooded crow. We obtained 856 675 reads from two runs, with average read length of 270 nt and coverage of 8.44. Reads were assembled de novo into 19 552 contigs, 70% of which could be assigned to annotated genes in chicken and zebra finch. This resulted in a total of 7637 orthologous genes and a core set of 1301 genes that could be compared across all individuals. We find a clear clustering of expression profiles for the pure carrion crow animals and disperse profiles for the animals from the hybrid zone. These results suggest that gene expression differences may indeed be a sensitive indicator of initial species divergence.


Subject(s)
Crows/genetics, Gene Expression Profiling, Genetic Hybridization, Animals, Cluster Analysis, Comparative Genomic Hybridization, Expressed Sequence Tags, Gene Expression, Pilot Projects, DNA Sequence Analysis/methods
19.
G3 (Bethesda) ; 10(1): 211-223, 2020 Jan 07.
Article in English | MEDLINE | ID: mdl-31699776

ABSTRACT

With up to millions of nearly neutral polymorphisms now being routinely sampled in population-genomic surveys, it is possible to estimate the site-frequency spectrum of such sites with high precision. Each frequency class reflects a mixture of potentially unique demographic histories, which can be revealed using theory for the probability distributions of the starting and ending points of branch segments over all possible coalescence trees. Such distributions are completely independent of past population history, which only influences the segment lengths, providing the basis for estimating average population sizes separating tree-wide coalescence events. The history of population-size change experienced by a sample of polymorphisms can then be dissected in a model-flexible fashion, and extension of this theory allows estimation of the mean and full distribution of long-term effective population sizes and ages of alleles of specific frequencies. Here, we outline the basic theory underlying the conceptual approach, develop and test an efficient statistical procedure for parameter estimation, and apply this to multiple population-genomic datasets for the microcrustacean Daphnia pulex.


Subject(s)
Biomass, Genetic Models, Single Nucleotide Polymorphism, Animals, Daphnia/genetics, Daphnia/growth & development
20.
Mol Cell Biol ; 23(3): 864-72, 2003 Feb.
Article in English | MEDLINE | ID: mdl-12529392

ABSTRACT

Nuclear receptors are ligand-modulated transcription factors. On the basis of the completed human genome sequence, this family was thought to contain 48 functional members. However, by mining human and mouse genomic sequences, we identified FXRbeta as a novel family member. It is a functional receptor in mice, rats, rabbits, and dogs but constitutes a pseudogene in humans and primates. Murine FXRbeta is widely coexpressed with FXR in embryonic and adult tissues. It heterodimerizes with RXRalpha and stimulates transcription through specific DNA response elements upon addition of 9-cis-retinoic acid. Finally, we identified lanosterol as a candidate endogenous ligand that induces coactivator recruitment and transcriptional activation by mFXRbeta. Lanosterol is an intermediate of cholesterol biosynthesis, which suggests a direct role in the control of cholesterol biosynthesis in nonprimates. The identification of FXRbeta as a novel functional receptor in nonprimate animals sheds new light on the species differences in cholesterol metabolism and has strong implications for the interpretation of genetic and pharmacological studies of FXR-directed physiologies and drug discovery programs.


Subject(s)
DNA-Binding Proteins/genetics, DNA-Binding Proteins/metabolism, Lanosterol/metabolism, Cytoplasmic and Nuclear Receptors/genetics, Cytoplasmic and Nuclear Receptors/metabolism, Transcription Factors/genetics, Transcription Factors/metabolism, Amino Acid Sequence, Animals, Base Sequence, Cholesterol/metabolism, Molecular Cloning, Complementary DNA/genetics, DNA-Binding Proteins/chemistry, Dimerization, Dogs, Humans, Ligands, Male, Mice, Molecular Sequence Data, Primates, Quaternary Protein Structure, Pseudogenes, Rabbits, Rats, Transcription Factors/chemistry