ABSTRACT
Despite the growth of Open Access, potentially illegal circumvention of paywalls to access scholarly publications is becoming a more mainstream phenomenon. The web service Sci-Hub is amongst the biggest facilitators of this, offering free access to around 62 million publications. How and why its users access publications through Sci-Hub has so far not been well studied. By utilizing the recently released corpus of Sci-Hub and comparing it to the data of ~28 million downloads made through the service, this study tries to address some of these questions. The comparative analysis shows that both the usage data and the complete corpus are largely made up of recently published articles, with users disproportionately favoring newer articles and 35% of downloaded articles having been published after 2013. These results hint that embargo periods before publications become Open Access are frequently circumvented using Guerilla Open Access approaches like Sci-Hub. On a journal level, the downloads show a bias towards some scholarly disciplines, especially Chemistry, suggesting increased barriers to access for these. Comparing the use and corpus on a publisher level, it becomes clear that only 11% of publishers are highly requested in comparison to the baseline frequency, while 45% of all publishers are significantly less accessed than expected. Despite this, the oligopoly of publishers is even more remarkable on the level of content consumption, with 80% of all downloaded articles having been published by only 9 publishers. All of this suggests that Sci-Hub is used by different populations and for a number of different reasons, and that there is still a lack of access to the published scientific record. A further analysis of these openly available data resources will undoubtedly be valuable for the investigation of academic publishing.
ABSTRACT
We explored the characteristics and motivations of people who, having obtained their genetic or genomic data from Direct-To-Consumer genetic testing (DTC-GT) companies, voluntarily decide to share them on the publicly accessible web platform openSNP. The study is the first attempt to describe open data sharing activities undertaken by individuals without institutional oversight. In the paper we provide a detailed overview of the distribution of the demographic characteristics and motivations of people engaged in genetic or genomic open data sharing. The geographical distribution of the respondents showed the USA as dominant. There was no significant gender divide, the age distribution was broad, educational background varied and respondents with and without children were equally represented. Health, even though prominent, was not the respondents' primary or only motivation to be tested. As to their motivations to openly share their data, 86.05% indicated wanting to learn about themselves as relevant, followed by contributing to the advancement of medical research (80.30%), improving the predictability of genetic testing (76.02%) and considering it fun to explore genotype and phenotype data (75.51%). Whereas most respondents were well aware of the privacy risks of their involvement in open genetic data sharing and considered the possibility of direct, personal repercussions troubling, they estimated the risk of this happening to be negligible. Our findings highlight the diversity of DTC-GT consumers who decide to openly share their data. Instead of focusing exclusively on health-related aspects of genetic testing and data sharing, our study emphasizes the importance of taking into account benefits and risks that stretch beyond the health spectrum. Our results thus lend further support to the call for a broader and multi-faceted conceptualization of genomic utility.
Subjects
Genomics, Information Services, Direct-To-Consumer Screening and Testing, Female, Humans, Male
ABSTRACT
The ever-growing availability of high-quality genotypes for a multitude of species has enabled researchers to explore the underlying genetic architecture of complex phenotypes at an unprecedented level of detail using genome-wide association studies (GWAS). The systematic comparison of results obtained from GWAS of different traits opens up new possibilities, including the analysis of pleiotropic effects. Other advantages that result from the integration of multiple GWAS are the ability to replicate GWAS signals and to increase statistical power to detect such signals through meta-analyses. In order to facilitate the simple comparison of GWAS results, we present easyGWAS, a powerful, species-independent online resource for computing, storing, sharing, annotating, and comparing GWAS. The easyGWAS tool supports multiple species, the uploading of private genotype data and summary statistics of existing GWAS, as well as advanced methods for comparing GWAS results across different experiments and data sets in an interactive and user-friendly interface. easyGWAS is also a public data repository for GWAS data and summary statistics and already includes published data and results from several major GWAS. We demonstrate the potential of easyGWAS with a case study of the model organism Arabidopsis thaliana, using flowering and growth-related traits.
Subjects
Computational Biology/methods, Plant Genome/genetics, Genome-Wide Association Study/methods, Single Nucleotide Polymorphism, Arabidopsis/genetics, Arabidopsis/growth & development, Flowers/genetics, Flowers/growth & development, Genotype, Humans, Phenotype, Reproducibility of Results, Software, User-Computer Interface
ABSTRACT
PREMISE OF THE STUDY: Polymorphic microsatellite markers were developed for the lichen species Cetraria aculeata (Parmeliaceae) to study fine-scale population diversity and phylogeographic structure. METHODS AND RESULTS: Using Illumina HiSeq and MiSeq, 15 fungus-specific microsatellite markers were developed and tested on 81 specimens from four populations from Spain. The number of alleles ranged from four to 13 per locus with a mean of 7.9, and average gene diversities varied from 0.40 to 0.73 over the four populations. The amplification rates of 10 markers (CA01-CA10) in populations of C. aculeata exceeded 85%. The markers also amplified across a range of closely related species, except for locus CA05, which did not amplify in C. australiensis and C. "panamericana," and locus CA10, which did not amplify in C. australiensis. CONCLUSIONS: The identified microsatellite markers will be used to study the genetic diversity and phylogeographic structure in populations of C. aculeata in western Eurasia.
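The gene diversity values reported above can be illustrated with a short sketch. It assumes Nei's unbiased estimator for a single locus, H = n/(n-1) * (1 - sum of squared allele frequencies); the abstract does not state which estimator was used. The function name and the allele sizes are hypothetical and are not data from the study.

```python
from collections import Counter


def gene_diversity(alleles):
    """Nei's unbiased gene diversity for one locus.

    `alleles` holds one allele call per sampled haploid individual
    (for example, microsatellite fragment sizes). The choice of
    estimator is an assumption, not taken from the abstract.
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two sampled alleles")
    # Sum of squared allele frequencies (expected homozygosity).
    sum_sq = sum((count / n) ** 2 for count in Counter(alleles).values())
    # Small-sample correction n / (n - 1).
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical fragment sizes for eight specimens at one locus.
example_locus = [150, 150, 154, 154, 158, 158, 158, 162]
print(round(gene_diversity(example_locus), 2))
```

Averaging this value over all loci in a population would give a per-population figure comparable in kind to the averages quoted in the abstract.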
ABSTRACT
BACKGROUND: Information transfer in mammalian communication networks is often based on the deposition of excreta in latrines. Depending on the intended receiver(s), latrines are either formed at territorial boundaries (between-group communication) or in core areas of home ranges (within-group communication). The relative importance of both types of marking behavior should depend, amongst other factors, on population densities and social group sizes, which tend to differ between urban and rural wildlife populations. Our study is the first to assess (direct and indirect) anthropogenic influences on mammalian latrine-based communication networks along a rural-to-urban gradient in European rabbits (Oryctolagus cuniculus) living in urban, suburban and rural areas in and around Frankfurt am Main (Germany). RESULTS: The proportion of latrines located in close proximity to the burrow was higher at rural study sites compared to urban and suburban ones. At rural sites, we found the largest latrines and highest latrine densities close to the burrow, suggesting that core marking prevailed. By contrast, latrine dimensions and densities increased with increasing distance from the burrow in urban and suburban populations, suggesting a higher importance of peripheral marking. CONCLUSIONS: Increased population densities, but smaller social group sizes in urban rabbit populations may lead to an increased importance of between-group communication and thus, favor peripheral over core marking. Our study provides novel insights into the manifold ways by which man-made habitat alterations along a rural-to-urban gradient directly and indirectly affect wildlife populations, including latrine-based communication networks.
Subjects
Wild Animals/physiology, Rabbits/physiology, Animal Distribution, Animal Migration, Animals, Ecosystem, Female, Male, Rural Population, Urban Renewal
ABSTRACT
Whole-genome shotgun sequencing of multispecies communities using only a single library layout is commonly used to assess taxonomic and functional diversity of microbial assemblages. Here, we investigate to what extent such metagenome skimming approaches are applicable for in-depth genomic characterizations of eukaryotic communities, for example lichens. We address how best to assemble a particular eukaryotic metagenome skimming data set, what pitfalls can occur, and what genome quality can be expected from these data. To facilitate project-specific benchmarking, we introduce the concept of twin sets, simulated data resembling the outcome of a particular metagenome sequencing study. We show that the quality of genome reconstructions depends essentially on assembler choice. Individual tools, including the metagenome assemblers Omega and MetaVelvet, are surprisingly sensitive to low and uneven coverages. In combination with the routine practice of choosing assembly parameters to optimize the assembly N50 size, these tools can preclude an entire genome from the assembly. In contrast, MIRA, an all-purpose overlap assembler, and SPAdes, a multisized de Bruijn graph assembler, facilitate a comprehensive view on the individual genomes across a wide range of coverage ratios. Testing assemblers on a real-world metagenome skimming data set from the lichen Lasallia pustulata demonstrates the applicability of twin sets for guiding method selection. Furthermore, it reveals that the assembly outcome for the photobiont Trebouxia sp. falls behind the a priori expectation given the simulations. Although the underlying reasons remain unclear, this highlights that further studies on this organism require special attention during sequence data generation and downstream analysis.
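The point about optimizing for N50 can be made concrete with a minimal sketch. It uses the common definition of N50: the length of the contig at which the running total, taken from the longest contig downwards, first reaches half of the total assembly length. The contig lengths are invented for illustration and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # First contig at which at least half of the assembly is covered.
        if running * 2 >= total:
            return length
    return 0  # empty assembly


# Hypothetical assemblies: the second genome has low coverage and
# assembles only into many short contigs.
both_genomes = [500, 400] + [20] * 30
one_genome_dropped = [500, 400]

print(n50(both_genomes))        # 400
print(n50(one_genome_dropped))  # 500
```

In this toy case the assembly that loses the fragmented genome scores the higher N50, which is one way that parameter tuning towards N50 alone could favor an assembly lacking an entire genome.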
Subjects
Biota, Computational Biology/methods, Lichens/genetics, Metagenome, Metagenomics/methods, DNA Sequence Analysis/methods, Ascomycota/classification, Ascomycota/genetics, Chlorophyta/classification, Chlorophyta/genetics, Computer Simulation, Lichens/classification
ABSTRACT
In recent years, there have been prominent calls for a new social contract that accords a more central role to citizens in health research. Typically, this has been understood as citizens and patients having a greater voice and role within the standard research enterprise. Beyond this, however, it is important that the renegotiated contract specifically addresses the oversight of a new, path-breaking approach to health research: participant-led research. In light of the momentum behind participant-led research and its potential to advance health knowledge by challenging and complementing traditional research, it is vital for all stakeholders to work together in securing the conditions that will enable it to flourish.
Subjects
Research Ethics, Human Experimentation/ethics, Public Policy, Research Subjects, Humans, Public Policy/legislation & jurisprudence, Public Policy/trends, Social Responsibility
ABSTRACT
BACKGROUND: We describe the pioneering experience of a Spanish family pursuing the goal of understanding their own personal genetic data to the fullest possible extent using Direct to Consumer (DTC) tests. With full informed consent from the Corpas family, all genotype, exome and metagenome data from members of this family are publicly available under a public domain Creative Commons 0 (CC0) license waiver. All scientists or companies analysing these data ("the Corpasome") were invited to return results to the family. METHODS: We released 5 genotypes, 4 exomes and 1 metagenome from the Corpas family via a blog and figshare under a public domain license, inviting scientists to join the crowdsourcing efforts to analyse the genomes in return for coauthorship or acknowledgement in derived papers. Resulting analysis data were compiled via social media and direct email. RESULTS: Here we present the results of our investigations, combining the crowdsourced contributions and our own efforts. Annotation services for genomic variants from four companies were applied to four family exomes: BIOBASE, Ingenuity, Diploid, and GeneTalk. Starting from a common VCF file and after selecting for significant results from company reports, we find no overlap among described annotations. We additionally report on a gut microbiome analysis of a member of the Corpas family. CONCLUSIONS: This study presents an analysis of a diverse set of tools and methods offered by four DTC companies. The striking discordance of the results mirrors previous findings with respect to DTC analysis of SNP chip data, and highlights the difficulties of using DTC data for preventive medical care. To our knowledge, the data and analysis results from our crowdsourced study represent the most comprehensive exome and analysis for a family quartet using solely DTC data generation to date.
Subjects
Crowdsourcing, Family, Genetic Testing, Genomics, Crowdsourcing/methods, Exome, Female, Gene Frequency, Genetic Testing/methods, Genomics/methods, Genotype, Humans, Male, Metagenome, Pedigree, Phenotype, Single Nucleotide Polymorphism, Precision Medicine/methods, Heritable Quantitative Trait, Spain
ABSTRACT
BACKGROUND: Life history traits like developmental time, age and size at maturity are directly related to fitness in all organisms and play a major role in adaptive evolution and speciation processes. Comparative genomic or transcriptomic approaches to identify positively selected genes involved in species divergence can help to generate hypotheses on the driving forces behind speciation. Here we use a bottom-up approach to investigate this hypothesis by comparative analysis of orthologous transcripts of four closely related European Radix species. RESULTS: Snails of the genus Radix occupy species-specific distribution ranges with distinct climatic niches, indicating a potential for speciation driven by natural selection based on ecological niche differentiation. We then inferred phylogenetic relationships among the four Radix species based on whole mt-genomes plus 23 nuclear loci. Three different tests to infer selection and changes in amino acid properties yielded a total of 134 genes with signatures of positive selection. The majority of these genes belonged to the functional gene ontology categories "reproduction" and "genitalia" with an overrepresentation of the functions "development" and "growth rate". CONCLUSIONS: We show here that Radix species divergence may be primarily enforced by selection on life history traits such as (larval-) development and growth rate. We thus hypothesise that life history differences may confer advantages under the corresponding climate regimes, e.g., species occupying warmer and drier habitats might have a fitness advantage with fast-developing susceptible life stages, which are more tolerant to habitat desiccation.
Subjects
Genetic Selection, Snails/classification, Snails/genetics, Animals, Biological Evolution, Climate, Ecosystem, Europe, Gene Expression Profiling, Gene Expression Regulation, Phylogeny, Snails/growth & development, Species Specificity
ABSTRACT
Populations that repeatedly adapt to the same environmental stressor offer a unique opportunity to study adaptation, especially if there are a priori predictions about the genetic basis underlying phenotypic evolution. Hydrogen sulphide (H2S) blocks the cytochrome-c oxidase complex (COX), predicting the evolution of decreased H2S susceptibility of the COX in three populations in the Poecilia mexicana complex that have colonized H2S-containing springs. Here, we demonstrate that decreased H2S susceptibility of COX evolved in parallel in two sulphide lineages, as evidenced by shared amino acid substitutions in cox1 and cox3 genes. One of the shared substitutions likely triggers conformational changes in COX1 blocking the access of H2S. In a third sulphide population, we detect no decreased H2S susceptibility of COX, suggesting that H2S resistance is achieved through another mechanism. Our study thus demonstrates that even closely related lineages follow both parallel and disparate molecular evolutionary paths to adaptation in response to the same selection pressure.
Subjects
Physiological Adaptation/genetics, Electron Transport Complex IV/genetics, Hydrogen Sulfide/adverse effects, Poecilia/genetics, Amino Acid Substitution, Animals, Base Sequence, Environment, Molecular Evolution, Mitochondria/genetics, Phylogeny, DNA Sequence Analysis
ABSTRACT
Genome-Wide Association Studies are widely used to correlate phenotypic traits with genetic variants. These studies usually compare the genetic variation between two groups to single out certain Single Nucleotide Polymorphisms (SNPs) that are linked to a phenotypic variation in one of the groups. However, it is necessary to have a large enough sample size to find statistically significant correlations. Direct-To-Consumer (DTC) genetic testing can supply additional data: DTC companies offer the analysis of a large number of SNPs for an individual at low cost without the need to consult a physician or geneticist. Over 100,000 people have already been genotyped through Direct-To-Consumer genetic testing companies. However, this data is not public for a variety of reasons and thus cannot be used in research. It seems reasonable to create a central open data repository for such data. Here we present the web platform openSNP, an open database which allows participants of Direct-To-Consumer genetic testing to publish their genetic data at no cost along with phenotypic information. Through this crowdsourced effort of collecting genetic and phenotypic information, openSNP has become a resource for a wide range of studies, including Genome-Wide Association Studies. openSNP is hosted at http://www.opensnp.org, and the code is released under the MIT license at http://github.com/gedankenstuecke/snpr.
Subjects
Crowdsourcing, Genomics, Single Nucleotide Polymorphism, Software, Genetic Testing, Genome-Wide Association Study/statistics & numerical data, Genotype, Humans, Information Dissemination/methods, Internet, Phenotype, Precision Medicine
ABSTRACT
BACKGROUND: The evolutionary forces driving niche segregation of closely related organisms are poorly understood. In addition, pinpointing the genes driving ecological divergence is a key goal in molecular ecology. Here, larval transcriptome sequences obtained by next-generation sequencing are used to address these issues in a morphologically cryptic sister species pair of non-biting midges (Chironomus riparius and C. piger). RESULTS: More than eight thousand orthologous open reading frames were screened for interspecific divergence and intraspecific polymorphisms. Despite a small mean sequence divergence of 1.53% between the sister species, 25.1% of 18,115 observed amino acid substitutions were inferred by α statistics to be driven by positive selection. Applying McDonald-Kreitman tests to 715 alignments of gene orthologues identified eleven (1.5%) genes driven by positive selection. CONCLUSIONS: Three candidate genes were identified as potentially responsible for the observed niche segregation concerning nitrite concentration, habitat temperature and water conductivity. Additionally, signs of positive selection in the hydrogen sulfide detoxification pathway were detected, providing a new plausible hypothesis for the species' ecological differentiation. Finally, a divergently selected, nuclear encoded mitochondrial ribosomal protein may contribute to reproductive isolation due to cytonuclear coevolution.
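As a rough illustration of the statistics named above, the sketch below computes the neutrality index and α from the four counts of a McDonald-Kreitman table: nonsynonymous and synonymous fixed differences (Dn, Ds) and polymorphisms (Pn, Ps). It assumes the widely used estimator α = 1 - (Ds * Pn) / (Dn * Ps); the abstract does not say which variant the authors applied. The function names and the counts are hypothetical.

```python
def neutrality_index(dn, ds, pn, ps):
    """Neutrality index NI = (Pn / Ps) / (Dn / Ds).

    NI < 1 indicates an excess of nonsynonymous divergence, which is
    the pattern expected under positive selection.
    """
    if dn == 0 or ps == 0:
        raise ValueError("Dn and Ps must be non-zero")
    return (pn * ds) / (ps * dn)


def mk_alpha(dn, ds, pn, ps):
    """Estimated fraction of nonsynonymous substitutions fixed by
    positive selection (assumed estimator: alpha = 1 - NI)."""
    return 1.0 - neutrality_index(dn, ds, pn, ps)


# Hypothetical counts for one gene alignment.
dn, ds, pn, ps = 20, 10, 5, 10
print(neutrality_index(dn, ds, pn, ps))  # 0.25
print(mk_alpha(dn, ds, pn, ps))          # 0.75
```

A full McDonald-Kreitman test would additionally assess the significance of the 2x2 table, for example with Fisher's exact test; that step is left out here.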
Subjects
Chironomidae/genetics, Ecological and Environmental Phenomena, Genomics, Phylogeny, Animals, Molecular Evolution, Gene Expression Profiling, Molecular Sequence Annotation, Genetic Polymorphism, Genetic Selection, Sequence Alignment, Sequence Analysis
ABSTRACT
Direct-to-consumer (DTC) genetic testing is a recent commercial endeavor that allows the general public to access personal genomic data. The growing availability of personal genomic data has in turn stimulated the development of non-commercial tools for DTC data analysis. Despite this new wealth of public resources, no systematic research has been carried out to assess these tools for interpretation of DTC data. Here, we provide an initial analysis benchmark in the context of a whole family, using single nucleotide polymorphism (SNP) data. Five blood-related DTC SNP chip data tests were analyzed in conjunction with one whole exome sequence. We report findings related to genomic similarity between individuals, genetic risks and an overall assessment of data quality, thus providing an evaluation of the current potential of public domain analysis tools for personal genomics. We envisage that as the use of personal genome tests spreads to the general population, publicly available tools will have a more prominent role in the interpretation of genomic data in the context of health risks and ancestry.