ABSTRACT
Phylogenetic models have become increasingly complex, and phylogenetic data sets have expanded in both size and richness. However, current inference tools lack a model specification language that can concisely describe a complete phylogenetic analysis while remaining independent of implementation details. We introduce a new lightweight and concise model specification language, 'LPhy', which is designed to be both human- and machine-readable. A graphical user interface accompanies LPhy, allowing users to build models, simulate data, and create natural language narratives describing the models. These narratives can serve as the foundation for manuscript method sections. Additionally, we present a command-line interface for converting LPhy-specified models into analysis specification files (in XML format) compatible with the BEAST2 software platform. Collectively, these tools aim to enhance the clarity of descriptions and reporting of probabilistic models in phylogenetic studies, ultimately promoting reproducibility of results.
Subject(s)
Language, Software, Humans, Phylogeny, Reproducibility of Results, Statistical Models, User-Computer Interface
ABSTRACT
Single-cell sequencing provides a new way to explore the evolutionary history of cells. Compared to traditional bulk sequencing, where a population of heterogeneous cells is pooled to form a single observation, single-cell sequencing isolates and amplifies genetic material from individual cells, thereby preserving the information about the origin of the sequences. However, single-cell data are more error-prone than bulk sequencing data due to the limited genomic material available per cell. Here, we present error and mutation models for evolutionary inference of single-cell data within a mature and extensible Bayesian framework, BEAST2. Our framework enables integration with biologically informative models such as relaxed molecular clocks and population dynamic models. Our simulations show that accounting for errors in the model increases the accuracy of relative divergence times and substitution parameters. We reconstruct the phylogenetic history of a colorectal cancer patient and a healthy patient from single-cell DNA sequencing data. We find that the estimated times of terminal splitting events are shifted forward in time compared to models that ignore errors. We also find that not accounting for errors can lead to overestimation of the phylogenetic diversity in single-cell DNA sequencing data. We estimate that 30-50% of the apparent diversity can be attributed to error. Our work enables a full Bayesian approach capable of accounting for errors in the data within the integrative Bayesian software framework BEAST2.
Subject(s)
Neoplasms, Software, Bayes Theorem, Molecular Evolution, Genomics, Humans, Genetic Models, Phylogeny
ABSTRACT
The rapid spread of highly pathogenic avian influenza (HPAI) A (H5N1) viruses in Southeast Asia in 2004 prompted the New Zealand Ministry for Primary Industries to expand its avian influenza surveillance in wild birds. A total of 18,693 birds were sampled between 2004 and 2020, including migratory shorebirds (in 2004-2009), other coastal species (in 2009-2010), and resident waterfowl (in 2004-2020). No avian influenza viruses (AIVs) were isolated from cloacal or oropharyngeal samples from migratory shorebirds or resident coastal species. Two samples from red knots (Calidris canutus) tested positive by influenza A RT-qPCR, but virus could not be isolated and no further characterization could be undertaken. In contrast, 6,179 samples from 15,740 mallards (Anas platyrhynchos) tested positive by influenza A RT-qPCR. Of these, 344 were positive for H5 and 51 for H7. All H5 and H7 viruses detected were of low pathogenicity, as confirmed by the absence of multiple basic amino acids at the hemagglutinin (HA) cleavage site. Twenty H5 viruses (six different neuraminidase [NA] subtypes) and 10 H7 viruses (two different NA subtypes) were propagated and characterized genetically. From H5- or H7-negative samples that tested positive by influenza A RT-qPCR, 326 AIVs were isolated, representing 41 HA/NA combinations. The most frequently isolated subtypes were H4N6, H3N8, H3N2, and H10N3. Multivariable logistic regression analysis of the relationship between the location and year of sampling and the presence of AIV in individual waterfowl showed that the AIV risk at a given location varied from year to year. The H5 and H7 isolates both formed monophyletic HA groups. The H5 viruses were most closely related to North American lineages, whereas the H7 viruses formed a sister cluster to wild bird viruses of the Eurasian and Australian lineages. Bayesian analysis indicates that the H5 and H7 viruses have circulated in resident mallards in New Zealand for some time. Correspondingly, we found limited evidence of influenza viruses in the major migratory bird populations visiting New Zealand. These findings suggest a low probability of introduction of HPAI viruses via long-distance bird migration and a unique epidemiology of AIV in New Zealand.