ABSTRACT
Fungi are present in all environments. They fulfil important ecological functions and play a crucial role in the food industry. Their accurate characterization is thus indispensable, particularly through metabarcoding. The most frequently used markers to monitor fungi are ITSs. These markers are the best documented in public databases but have one main weakness: polymerase chain reaction amplification may produce non-overlapping reads in a significant fraction of the fungi. When these reads are filtered out, traditional metabarcoding pipelines lose part of the information and consequently produce biased pictures of the composition and structure of the environment under study. We developed a solution that enables processing of the entire set of reads including both overlapping and non-overlapping, thus providing a more accurate picture of fungal communities. Our comparative tests using simulated and real data demonstrated the effectiveness of our solution, which can be used by both experts and non-specialists on a command line or through the Galaxy-based web interface.
Subject(s)
DNA Barcoding, Taxonomic/methods , DNA, Ribosomal Spacer , Fungi/classification , Fungi/genetics , Databases, Genetic , Fungi/metabolism , Metagenomics/methods , RNA, Ribosomal, 16S , Software , User-Computer Interface , Web Browser , Workflow
ABSTRACT
Assessing the evolutionary potential of animal populations in the wild is crucial to understanding how they may respond to selection mediated by rapid environmental change (e.g. habitat loss and fragmentation). A growing number of studies have investigated the adaptive role of behaviour, but assessments of its genetic basis in a natural setting remain scarce. We combined intensive biologging technology with genome-wide data and a pedigree-free quantitative genetic approach to quantify repeatability, heritability and evolvability for a suite of behaviours related to the risk avoidance-resource acquisition trade-off in a wild roe deer (Capreolus capreolus) population inhabiting a heterogeneous, human-dominated landscape. These traits, linked to the stress response, movement and space-use behaviour, were all moderately to highly repeatable. Furthermore, the repeatable among-individual component of variation in these traits was partly due to additive genetic variance, with heritability estimates ranging from 0.21 ± 0.08 to 0.70 ± 0.11 and evolvability ranging from 1.1% to 4.3%. Changes in the trait mean can therefore occur under hypothetical directional selection over just a few generations. To the best of our knowledge, this is the first empirical demonstration of additive genetic variation in space-use behaviour in a free-ranging population based on genomic relatedness data. We conclude that wild animal populations may have the potential to adjust their spatial behaviour to human-driven environmental modifications through microevolutionary change.
Subject(s)
Behavior, Animal , Deer/genetics , Quantitative Trait, Heritable , Spatial Behavior , Animals , Female , Male
ABSTRACT
Motivation: Metagenomics leads to major advances in microbial ecology and biologists need user-friendly tools to analyze their data on their own. Results: This Galaxy-supported pipeline, called FROGS, is designed to analyze large sets of amplicon sequences and produce abundance tables of Operational Taxonomic Units (OTUs) and their taxonomic affiliation. The clustering uses Swarm. The chimera removal uses VSEARCH, combined with original cross-sample validation. The taxonomic affiliation returns an innovative multi-affiliation output to highlight database conflicts and uncertainties. Statistical results and numerous graphical illustrations are produced along the way to monitor the pipeline. FROGS was tested for the detection and quantification of OTUs on real and in silico datasets and proved to be rapid, robust and highly sensitive. It compares favorably with the widespread mothur, UPARSE and QIIME. Availability and implementation: Source code and instructions for installation: https://github.com/geraldinepascal/FROGS.git. A companion website: http://frogs.toulouse.inra.fr. Contact: geraldine.pascal@inra.fr. Supplementary information: Supplementary data are available at Bioinformatics online.
Subject(s)
Metagenomics/methods , Software , Bacteria/genetics , Cluster Analysis
ABSTRACT
After publication of this work [1], we noted that there was an error in Table 3 Line 4.
ABSTRACT
BACKGROUND: Bacterial cold-water disease, which is caused by Flavobacterium psychrophilum, is one of the major diseases that affect rainbow trout (Oncorhynchus mykiss) and a primary concern for trout farming. Better knowledge of the genetic basis of resistance to F. psychrophilum would help to implement this trait in selection schemes and to investigate the immune mechanisms associated with resistance. Various studies have revealed that skin and mucus may contribute to response to infection. However, previous quantitative trait loci (QTL) studies were conducted by using injection as the route of infection. Immersion challenge, which is assumed to mimic natural infection by F. psychrophilum more closely, may reveal different defence mechanisms. RESULTS: Two isogenic lines of rainbow trout with contrasting susceptibilities to F. psychrophilum were crossed to produce doubled haploid F2 progeny. Fish were infected with F. psychrophilum either by intramuscular injection (115 individuals) or by immersion (195 individuals), and genotyped for 9654 markers using RAD-sequencing. Fifteen QTL associated with resistance traits were detected and only three QTL were common between the injection and immersion. Using a model that accounted for epistatic interactions between QTL, two main types of interactions were revealed. A "compensation-like" effect was detected between several pairs of QTL for the two modes of infection. An "enhancing-like" interaction effect was detected between four pairs of QTL. Integration of the QTL results with results of a previous transcriptomic analysis of response to F. psychrophilum infection resulted in a list of potential candidate immune genes that belong to four relevant functional categories (bacterial sensors, effectors of antibacterial immunity, inflammatory factors and interferon-stimulated genes). CONCLUSIONS: These results provide new insights into the genetic determinism of rainbow trout resistance to F. psychrophilum and confirm that some QTL with large effects are involved in this trait. For the first time, the role of epistatic interactions between resistance-associated QTL was evidenced. We found that the infection protocol used had an effect on the modulation of defence mechanisms and also identified relevant immune functional candidate genes.
Subject(s)
Fish Diseases/genetics , Fish Diseases/immunology , Flavobacteriaceae Infections/veterinary , Flavobacterium/physiology , Oncorhynchus mykiss , Quantitative Trait Loci , Animals , Disease Resistance , Female , Fish Diseases/microbiology , Flavobacteriaceae Infections/genetics , Flavobacteriaceae Infections/immunology , Genotype , Male , Phenotype , Polymorphism, Single Nucleotide
ABSTRACT
Bananas (Musa spp.), including dessert and cooking types, are giant perennial monocotyledonous herbs of the order Zingiberales, a sister group to the well-studied Poales, which include cereals. Bananas are vital for food security in many tropical and subtropical countries and the most popular fruit in industrialized countries. The Musa domestication process started some 7,000 years ago in Southeast Asia. It involved hybridizations between diverse species and subspecies, fostered by human migrations, and selection of diploid and triploid seedless, parthenocarpic hybrids thereafter widely dispersed by vegetative propagation. Half of the current production relies on somaclones derived from a single triploid genotype (Cavendish). Pests and diseases have gradually become adapted, representing an imminent danger for global banana production. Here we describe the draft sequence of the 523-megabase genome of a Musa acuminata doubled-haploid genotype, providing a crucial stepping-stone for genetic improvement of banana. We detected three rounds of whole-genome duplications in the Musa lineage, independently of those previously described in the Poales lineage and the one we detected in the Arecales lineage. This first monocotyledon high-continuity whole-genome sequence reported outside Poales represents an essential bridge for comparative genome analysis in plants. As such, it clarifies commelinid-monocotyledon phylogenetic relationships, reveals Poaceae-specific features and has led to the discovery of conserved non-coding sequences predating monocotyledon-eudicotyledon divergence.
Subject(s)
Evolution, Molecular , Genome, Plant/genetics , Musa/genetics , Conserved Sequence/genetics , DNA Transposable Elements/genetics , Gene Duplication/genetics , Genes, Plant/genetics , Genotype , Haploidy , Molecular Sequence Data , Musa/classification , Phylogeny
ABSTRACT
The gut microbiota is known to play an important role in energy harvest and is likely to affect feed efficiency. In this study, we used 16S metabarcoding sequencing to analyse the caecal microbiota of laying hens from feed-efficient and non-efficient lines obtained by divergent selection for residual feed intake. The two lines were fed either a commercial wheat-soybean based diet (CTR) or a low-energy, high-fibre corn-sunflower diet (LE). The analysis revealed a significant line x diet interaction, highlighting distinct differences in microbial community composition between the two lines when hens were fed the CTR diet, and more muted differences when hens were fed the LE diet. Our results are consistent with the hypothesis that a richer and more diverse microbiota may play a role in enhancing feed efficiency, albeit in a diet-dependent manner. The taxonomic differences observed in the microbial composition seem to correlate with alterations in starch and fibre digestion as well as in the production of short-chain fatty acids. As a result, we hypothesise that efficient hens are able to optimise nutrient absorption through the activity of fibrolytic bacteria such as Alistipes or Anaerosporobacter, which, via their production of propionate, influence various aspects of host metabolism.
Subject(s)
Chickens , Gastrointestinal Microbiome , Animals , Female , Chickens/metabolism , Animal Feed/analysis , Diet/veterinary , Eating , Animal Nutritional Physiological Phenomena
ABSTRACT
UNLABELLED: We developed a modular and scalable framework called Eoulsan, based on the Hadoop implementation of the MapReduce algorithm dedicated to high-throughput sequencing data analysis. Eoulsan allows users to easily set up a cloud computing cluster and automate the analysis of several samples at once using various software solutions available. Our tests with Amazon Web Services demonstrated that the computation cost is linear with the number of instances booked as is the running time with the increasing amounts of data. AVAILABILITY AND IMPLEMENTATION: Eoulsan is implemented in Java, supported on Linux systems and distributed under the LGPL License at: http://transcriptome.ens.fr/eoulsan/
Subject(s)
Algorithms , Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, RNA/methods , Animals , Mice , Software
ABSTRACT
Although Archaea deliver important ecosystem services and are major players in global biogeochemical cycles, they remain poorly understood in freshwater ecosystems. To our knowledge, no studies specifically address the direct impact of xenobiotics on the riverine archaeome. Using environmental DNA metabarcoding of the 16S ribosomal gene, we previously demonstrated significant shifts in bacterial communities linked to pollutant mixtures during an extreme flood in a typical Mediterranean coastal watercourse. Here, using the same methodology, we sought to determine whether archaeal community shifts coincided with the delivery of environmental stressors during the same flood. Further, we wanted to determine how archaeal taxa compared across seasons. In contrast to the bacteriome, the archaeome showed a specific community in summer compared to winter and autumn. We also identified a significant relationship between in situ archaeome shifts and changes in physicochemical parameters along the flood, but a less marked link than for bacteria to those parameters correlated with river hydrodynamics. New urban-specific archaeal taxa significantly related to multiple stressors were identified. Through statistical modeling of both domains, our results demonstrate that Archaea, seldom considered as bioindicators of water quality, have the potential to improve monitoring methods of watersheds.
Subject(s)
Archaea , Ecosystem , Archaea/genetics , Seasons , RNA, Ribosomal, 16S/genetics , Bacteria/genetics , Rivers/microbiology
ABSTRACT
Environmental DNA (eDNA) metabarcoding has gained growing attention as a strategy for monitoring biodiversity in ecology. However, taxa identifications produced through metabarcoding require sophisticated processing of high-throughput sequencing data from taxonomically informative DNA barcodes. Various sets of universal and taxon-specific primers have been developed, extending the usability of metabarcoding across archaea, bacteria and eukaryotes. Accordingly, a multitude of metabarcoding data analysis tools and pipelines have also been developed. Often, several developed workflows are designed to process the same amplicon sequencing data, making it somewhat puzzling to choose one among the plethora of existing pipelines. However, each pipeline has its own specific philosophy, strengths and limitations, which should be considered depending on the aims of any specific study, as well as the bioinformatics expertise of the user. In this review, we outline the input data requirements, supported operating systems and particular attributes of thirty-two amplicon processing pipelines with the goal of helping users to select a pipeline for their metabarcoding projects.
ABSTRACT
BACKGROUND: Drug-susceptible clinical isolates of Candida albicans frequently become highly tolerant to drugs during chemotherapy, with dreadful consequences to patient health. We used RNA sequencing (RNA-seq) to analyze the transcriptomes of a CDR (Candida Drug Resistance) strain and its isogenic drug-sensitive counterpart. RESULTS: RNA-seq unveiled differential expression of 228 genes including a) genes previously identified as involved in CDR, b) genes not previously associated with the CDR phenotype, and c) novel transcripts whose function as a gene is uncharacterized. In particular, we show for the first time that CDR acquisition is correlated with an overexpression of the transcription factor encoding gene CZF1. CZF1 null mutants were susceptible to many drugs, independently of known multidrug resistance mechanisms. We show that CZF1 acts as a repressor of β-glucan synthesis, thus negatively regulating cell wall integrity. Finally, our RNA-seq data allowed us to identify a new transcribed region, upstream of the TAC1 gene, which encodes the major CDR transcriptional regulator. CONCLUSION: Our results open new perspectives on the role of Czf1 and on our understanding of the transcriptional and post-transcriptional mechanisms that lead to the acquisition of drug resistance in C. albicans, with potential for future improvements of therapeutic strategies.
Subject(s)
Candida albicans/genetics , Drug Resistance, Fungal , Fungal Proteins/genetics , Fungal Proteins/metabolism , Gene Expression Profiling , Sequence Analysis, RNA , Transcription Factors/genetics , Transcription Factors/metabolism , beta-Glucans/metabolism
ABSTRACT
Single nucleotide polymorphism (SNP) arrays, also named "SNP chips", enable very large numbers of individuals to be genotyped at a targeted set of thousands of genome-wide identified markers. We used preexisting variant datasets from USDA, a French commercial line and 30X-coverage whole genome sequencing of INRAE isogenic lines to develop an Affymetrix 665 K SNP array (HD chip) for rainbow trout. In total, we identified 32,372,492 SNPs that were polymorphic in the USDA or INRAE databases. A subset of identified SNPs were selected for inclusion on the chip, prioritizing SNPs whose flanking sequence uniquely aligned to the Swanson reference genome, with homogenous repartition over the genome and the highest Minimum Allele Frequency in both USDA and French databases. Of the 664,531 SNPs which passed the Affymetrix quality filters and were manufactured on the HD chip, 65.3% and 60.9% passed filtering metrics and were polymorphic in two other distinct French commercial populations in which, respectively, 288 and 175 sampled fish were genotyped. Only 576,118 SNPs mapped uniquely on both Swanson and Arlee reference genomes, and 12,071 SNPs did not map at all on the Arlee reference genome. Among those 576,118 SNPs, 38,948 SNPs were kept from the commercially available medium-density 57 K SNP chip. We demonstrate the utility of the HD chip by describing the high rates of linkage disequilibrium at 2-10 kb in the rainbow trout genome in comparison to the linkage disequilibrium observed at 50-100 kb which are usual distances between markers of the medium-density chip.
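The abstract above compares linkage disequilibrium at short and long marker distances but does not say which statistic was used. As a minimal sketch, assuming the common pairwise r² computed from phased biallelic haplotypes, the calculation could look as follows. The function name `ld_r2` and the toy haplotypes are hypothetical and are not taken from the study.

```python
def ld_r2(haplotypes):
    """Pairwise r^2 between two biallelic SNPs.

    `haplotypes` is a list of (allele_at_snp1, allele_at_snp2) tuples,
    one per phased haplotype, with alleles coded 0 or 1 (assumed coding).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at SNP 1
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at SNP 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    return d * d / denom


# Toy data (hypothetical): a tightly linked pair and a loosely linked pair.
close_pair = [(1, 1)] * 5 + [(0, 0)] * 5
distant_pair = [(1, 1)] * 3 + [(1, 0)] * 2 + [(0, 1)] * 2 + [(0, 0)] * 3

print(ld_r2(close_pair), ld_r2(distant_pair))
```

In this toy case the perfectly associated pair yields a much higher r² than the loosely associated pair, which is the kind of contrast the abstract reports between 2-10 kb and 50-100 kb.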
ABSTRACT
Most single-nucleotide polymorphisms (SNPs) are located in non-coding regions, but the fraction usually studied is harbored in protein-coding regions because potential impacts on proteins are relatively easy to predict by popular tools such as the Variant Effect Predictor. These tools annotate variants independently without considering the potential effect of grouped or haplotypic variations, often called "multi-nucleotide variants" (MNVs). Here, we used a large RNA-seq dataset to survey MNVs, comprising 382 chicken samples originating from 11 populations analyzed in the companion paper in which 9.5M SNPs, including 3.3M SNPs with reliable genotypes, were detected. We focused our study on in-codon MNVs and evaluated their potential mis-annotation. Using GATK HaplotypeCaller read-based phasing results, we identified 2,965 MNVs observed in at least five individuals located in 1,792 genes. We found that 41.1% of them showed a novel impact when compared to the effect of their constituent SNPs analyzed separately. The biggest impact variation flux concerns the originally annotated stop-gained consequences, for which around 95% were rescued; this flux is followed by the missense consequences, for which 37% were reannotated with a different amino acid. We then present in more depth the rescued stop-gained MNVs and give an illustration in the SLC27A4 gene. As previously shown in human datasets, our results in chicken demonstrate the value of haplotype-aware variant annotation, and the interest of considering MNVs in the coding region, particularly when searching for severe functional consequences such as stop-gained variants.
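The idea of a "rescued" stop-gained variant can be illustrated with a small sketch. It assumes the standard genetic code and two SNPs phased on the same haplotype within one codon. The function names and the example codon are hypothetical and do not reproduce the annotation pipeline used in the study.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def consequence(ref_codon, alt_codon):
    """Classify the change from ref_codon to alt_codon."""
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop_gained"
    if ref_aa == "*":
        return "stop_lost"
    return "missense"


def apply_snps(codon, snps):
    """Return the codon after applying (position, alt_base) substitutions."""
    bases = list(codon)
    for pos, alt in snps:
        bases[pos] = alt
    return "".join(bases)


def annotate(ref_codon, snps):
    """Annotate each SNP on its own, then all SNPs jointly as one MNV."""
    separate = [consequence(ref_codon, apply_snps(ref_codon, [s])) for s in snps]
    joint = consequence(ref_codon, apply_snps(ref_codon, snps))
    return separate, joint


# Hypothetical codon CAG (Gln) carrying C>T at position 0 and G>C at position 2.
separate, joint = annotate("CAG", [(0, "T"), (2, "C")])
print(separate, joint)
```

Annotated separately, the first SNP gives TAG (stop) and the second gives CAC (His). Taken together, the haplotype codon is TAC (Tyr), so the joint consequence is a missense change rather than a stop-gained one.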
ABSTRACT
This study describes the associations between fecal microbiota and vaccine response variability in pigs, using 98 piglets vaccinated against the influenza A virus at 28 days of age (D28) with a booster at D49. Immune response to the vaccine is measured at D49, D56, D63, and D146 by serum levels of IAV-specific IgG and assays of hemagglutination inhibition (HAI). Analysis of the pre-vaccination microbiota characterized by 16S rRNA gene sequencing of fecal DNA reveals a higher vaccine response in piglets with a richer microbiota, and shows that 23 operational taxonomic units (OTUs) are differentially abundant between high and low IAV-specific IgG producers at D63. A stronger immune response is linked with OTUs assigned to the genus Prevotella and family Muribaculaceae, and a weaker response is linked with OTUs assigned to the genera Helicobacter and Escherichia-Shigella. A set of 81 OTUs accurately predicts IAV-specific IgG and HAI titer levels at all time points, highlighting early and late associations between pre-vaccination fecal microbiota composition and immune response to the vaccine.
ABSTRACT
Phenotypic plasticity is a key component of the ability of organisms to respond to changing environmental conditions. In this study, we aimed to study the establishment of DNA methylation marks in response to an environmental stress in rainbow trout and to assess whether these marks depend on the genetic background. The environmental stress chosen here was temperature, a known induction factor of epigenetic marks in fish. To disentangle the role of epigenetic mechanisms such as DNA methylation in generating phenotypic variations, nine rainbow trout isogenic lines with no genetic variability within a line were used. For each line, half of the eggs were incubated at standard temperature (11°C) and the other half at high temperature (16°C), from eyed-stage to hatching. In order to gain a first insight into the establishment of DNA methylation marks in response to an early temperature regime (control 11°C vs. heated 16°C), we have studied the expression of 8 dnmt3 (DNA methyltransferase) genes, potentially involved in de novo methylation, and analysed global DNA methylation in the different rainbow trout isogenic lines using LUMA (LUminometric Methylation Assay). Finally, finer investigation of genome-wide methylation patterns was performed using EpiRADseq, a reduced-representation library approach based on the ddRADseq (Double Digest Restriction Associated DNA) protocol, for six rainbow trout isogenic lines. We have demonstrated that thermal history during embryonic development alters patterns of DNA methylation, but to a greater or lesser extent depending on the genetic background.
Subject(s)
Oncorhynchus mykiss , Animals , DNA Methylation , Embryonic Development , Genetic Background , Temperature
ABSTRACT
In addition to their common usage for studying gene expression, RNA-seq data accumulated over the last 10 years are a yet-unexploited resource of SNPs in numerous individuals from different populations. SNP detection by RNA-seq is particularly interesting for livestock species since whole genome sequencing is expensive and exome sequencing tools are unavailable. These SNPs detected in expressed regions can be used to characterize variants affecting protein functions, and to study cis-regulated genes by analyzing allele-specific expression (ASE) in the tissue of interest. However, gene expression can be highly variable, and filters for SNP detection using the popular GATK toolkit are not yet standardized, making SNP detection and genotype calling by RNA-seq a challenging endeavor. We compared SNP calling results using GATK suggested filters, on two chicken populations for which both RNA-seq and DNA-seq data were available for the same samples of the same tissue. We showed, in expressed regions, an RNA-seq precision of 91% (SNPs detected by RNA-seq and shared by DNA-seq) and we characterized the remaining 9% of SNPs. We then studied the genotype (GT) obtained by RNA-seq and the impact of two factors (GT call-rate and read number per GT) on the concordance of GT with DNA-seq; we proposed thresholds for them leading to a 95% concordance. Applying these thresholds to 767 multi-tissue RNA-seq of 382 birds of 11 chicken populations, we found 9.5 M SNPs in total, of which ~550,000 SNPs per tissue and population had a reliable GT (call rate ≥ 50%) and, among them, ~340,000 had a MAF ≥ 10%. We showed that such RNA-seq data from one tissue can be used to (i) detect SNPs with a strong predicted impact on proteins, despite their scarcity in each population (16,307 SIFT deleterious missenses and 590 stop-gained), (ii) study, on a large scale, cis-regulations of gene expression, with ~81% of protein-coding and 68% of long non-coding genes (TPM ≥ 1) that can be analyzed for ASE, and with ~29% of them that were cis-regulated, and (iii) analyze population genetics using such SNPs located in expressed regions. This work shows that RNA-seq data can be used with good confidence to detect SNPs and associated GT within various populations and to use them for different analyses, as in GTEx studies.
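The precision and filtering criteria quoted above can be sketched in a few lines. This is only an illustration under stated assumptions: SNPs are identified by hypothetical (chromosome, position) keys, genotypes are coded as alternate-allele counts (0, 1, 2) with `None` for missing calls, and the default thresholds reuse the call rate ≥ 50% and MAF ≥ 10% values from the abstract. The read-number-per-genotype threshold is not stated in the abstract and is left out.

```python
def precision(rna_snps, dna_snps):
    """Fraction of RNA-seq SNPs that are also found by DNA-seq."""
    rna_snps = set(rna_snps)
    if not rna_snps:
        raise ValueError("no RNA-seq SNPs to evaluate")
    return len(rna_snps & set(dna_snps)) / len(rna_snps)


def call_rate(genotypes):
    """Fraction of individuals with a non-missing genotype."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes):
    """MAF from alternate-allele counts (0/1/2), ignoring missing calls."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)


def passes_filters(genotypes, min_call_rate=0.5, min_maf=0.10):
    """Apply the call-rate and MAF thresholds to one SNP."""
    return (call_rate(genotypes) >= min_call_rate
            and minor_allele_frequency(genotypes) >= min_maf)


# Hypothetical toy data, not taken from the study.
rna = {("chr1", 100), ("chr1", 250), ("chr2", 40), ("chr3", 7)}
dna = {("chr1", 100), ("chr1", 250), ("chr2", 40)}
print(precision(rna, dna))
print(passes_filters([0, 1, None, 2, 0, None]))
```

Here three of the four toy RNA-seq SNPs are shared with DNA-seq, and the example SNP has four called genotypes out of six with an alternate-allele frequency of 3/8, so it passes both thresholds.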
ABSTRACT
Rivers are representative of the overall contamination found in their catchment area. Contaminant concentrations in watercourses depend on numerous factors including land use and rainfall events. Globally, in Mediterranean regions, rainstorms are at the origin of fluvial multipollution phenomena as a result of Combined Sewer Overflows (CSOs) and floods. Large loads of urban-associated microorganisms, including faecal bacteria, are released from CSOs which place public health - as well as ecosystems - at risk. The impacts of freshwater contamination on river ecosystems have not yet been adequately addressed, as is the case for the release of pollutant mixtures linked to extreme weather events. In this context, microbial communities provide critical ecosystem services as they are the only biological compartment capable of degrading or transforming pollutants. Through the use of 16S rRNA gene metabarcoding of environmental DNA at different seasons and during a flood event in a typical Mediterranean coastal river, we show that the impacts of multipollution phenomena on structural shifts in the particle-attached riverine bacteriome were greater than those of seasonality. Key players were identified via multivariate statistical modelling combined with network module eigengene analysis. These included species highly resistant to pollutants as well as pathogens. Their rapid response to contaminant mixtures makes them ideal candidates as potential early biosignatures of multipollution stress. Multiple resistance gene transfer is likely enhanced with drastic consequences for the environment and human-health, particularly in a scenario of intensification of extreme hydrological events.
Subject(s)
Ecosystem , Environmental Monitoring , Environmental Pollutants , Mediterranean Region , RNA, Ribosomal, 16S , Rivers
ABSTRACT
Long non-coding RNAs (LNC) regulate numerous biological processes. In contrast to human, the identification of LNC in farm species, such as chicken, is still incomplete. We propose a catalogue of 52,075 chicken genes enriched in LNC (http://www.fragencode.org/), built from the Ensembl reference extended using novel LNC modelled here from 364 RNA-seq and LNC from four public databases. The Ensembl reference grew from 4,643 to 30,084 LNC, of which 59% and 41% had expression ≥ 0.5 and ≥ 1 TPM, respectively. Characterization of these LNC relative to the closest protein-coding genes (PCG) revealed that 79% of LNC are in intergenic regions, as in other species. Expression analysis across 25 tissues revealed an enrichment of co-expressed LNC:PCG pairs, suggesting co-regulation and/or co-function. As expected, LNC were more tissue-specific than PCG (25% vs. 10%). As in human, 16% of chicken LNC hosted one or more miRNA. We highlighted a new chicken LNC, hosting miR155, conserved in human, highly expressed in immune tissues like miR155, and correlated with immunity-related PCG in both species. Among LNC:PCG pairs tissue-specific in the same tissue, we revealed an enrichment of divergent pairs with the PCG coding transcription factors, for example LHX5, HXD3 and TBX4, in both human and chicken.
Subject(s)
Chickens/genetics , Computational Biology/methods , Molecular Sequence Annotation/methods , RNA, Long Noncoding/genetics , Animals , Atlases as Topic , Avian Proteins/genetics , Gene Expression Profiling , Gene Expression Regulation , Gene Regulatory Networks , MicroRNAs/genetics , Organ Specificity , Sequence Analysis, RNA , Tissue Distribution
ABSTRACT
Rainbow trout has a male heterogametic (XY) sex determination system controlled by a major sex-determining gene, sdY. Unexpectedly, a few phenotypically masculinised fish are regularly observed in all-female farmed trout stocks. To better understand the genetic determinism underlying spontaneous maleness in XX-rainbow trout, we recorded the phenotypic sex of 20,210 XX-rainbow trout from a French farm population at 10 and 15 months post-hatching. The overall masculinisation rate was 1.45%. We performed two genome-wide association studies (GWAS) on a subsample of 1139 individuals classified as females, intersex or males using either medium-throughput genotyping (31,811 SNPs) or whole-genome sequencing (WGS, 8.7 million SNPs). The genomic heritability of maleness ranged between 0.48 and 0.62 depending on the method and the number of SNPs used for the estimation. At the 31K SNPs level, we detected four QTL on three chromosomes (Omy1, Omy12 and Omy20). Using WGS information, we narrowed down the positions of the two QTL detected on Omy1 to 96 kb and 347 kb respectively, with the second QTL explaining up to 14% of the total genetic variance of maleness. Within this QTL, we detected three putative candidate genes, fgfa8, cyp17a1 and an uncharacterised protein (LOC110527930), which might be involved in spontaneous maleness of XX-female rainbow trout.
Subject(s)
Genotype , Oncorhynchus mykiss/genetics , Sex Determination Processes , Whole Genome Sequencing , Animals , Female , Male , Phenotype
ABSTRACT
Estimating the evolutionary potential of quantitative traits and reliably predicting responses to selection in wild populations are important challenges in evolutionary biology. The genomic revolution has opened up opportunities for measuring relatedness among individuals with precision, enabling pedigree-free estimation of trait heritabilities in wild populations. However, until now, most quantitative genetic studies based on a genomic relatedness matrix (GRM) have focused on long-term monitored populations for which traditional pedigrees were also available, and have often had access to knowledge of genome sequence and variability. Here, we investigated the potential of RAD-sequencing for estimating heritability in a free-ranging roe deer (Capreolus capreolus) population for which no prior genomic resources were available. We propose a step-by-step analytical framework to optimize the quality and quantity of the genomic data and explore the impact of the single nucleotide polymorphism (SNP) calling and filtering processes on the GRM structure and GRM-based heritability estimates. As expected, our results show that sequence coverage strongly affects the number of recovered loci, the genotyping error rate and the amount of missing data. Ultimately, this had little effect on heritability estimates and their standard errors, provided that the GRM was built from a minimum number of loci (above 7,000). Genomic relatedness matrix-based heritability estimates thus appear robust to a moderate level of genotyping errors in the SNP data set. We also showed that quality filters, such as the removal of low-frequency variants, affect the relatedness structure of the GRM, generating lower h² estimates. Our work illustrates the huge potential of RAD-sequencing for estimating GRM-based heritability in virtually any natural population.
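The abstract does not state which GRM estimator was used. As a minimal sketch of the GRM construction step only, assuming one widely used estimator (VanRaden's first method, G = ZZᵀ / (2 Σ p(1 − p)) with Z the genotype matrix centred on twice the allele frequency) and complete genotypes coded 0/1/2, the matrix could be built as follows. The function name `grm` and the toy genotypes are hypothetical. Missing data and the downstream variance-component estimation of heritability are not handled here.

```python
def grm(genotypes):
    """Genomic relatedness matrix from a list of per-individual genotype rows.

    Each row holds alternate-allele counts (0, 1 or 2) at the same loci.
    Assumes no missing genotypes.
    """
    n_ind = len(genotypes)
    n_loci = len(genotypes[0])
    # Alternate-allele frequency at each locus, estimated from the sample.
    freqs = [sum(row[j] for row in genotypes) / (2 * n_ind) for j in range(n_loci)]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("at least one locus must be polymorphic")
    # Centre each genotype on its expected value 2p.
    centred = [[row[j] - 2 * freqs[j] for j in range(n_loci)] for row in genotypes]
    return [
        [sum(a * b for a, b in zip(centred[i], centred[k])) / scale
         for k in range(n_ind)]
        for i in range(n_ind)
    ]


# Hypothetical toy data: individuals 0 and 1 carry identical genotypes.
toy = [[0, 1, 2], [0, 1, 2], [2, 1, 0], [1, 0, 1]]
G = grm(toy)
for row in G:
    print([round(x, 3) for x in row])
```

Because genotypes are centred on sample allele frequencies, each column of the resulting matrix sums to zero, and individuals with identical genotypes have identical rows. Dropping low-frequency loci changes both `freqs` and `scale`, which gives a sense of how filtering can alter the relatedness structure as the abstract reports.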