ABSTRACT
CDC has used national genomic surveillance since December 2020 to monitor SARS-CoV-2 variants that have emerged throughout the COVID-19 pandemic, including the Omicron variant. This report summarizes U.S. trends in variant proportions from national genomic surveillance during January 2022-May 2023. During this period, the Omicron variant remained predominant, with various descendant lineages reaching national predominance (>50% prevalence). During the first half of 2022, BA.1.1 reached predominance by the week ending January 8, 2022, followed by BA.2 (March 26), BA.2.12.1 (May 14), and BA.5 (July 2); the predominance of each variant coincided with surges in COVID-19 cases. The latter half of 2022 was characterized by the circulation of sublineages of BA.2, BA.4, and BA.5 (e.g., BQ.1 and BQ.1.1), some of which independently acquired similar spike protein substitutions associated with immune evasion. By the end of January 2023, XBB.1.5 became predominant. As of May 13, 2023, the most common circulating lineages were XBB.1.5 (61.5%), XBB.1.9.1 (10.0%), and XBB.1.16 (9.4%); XBB.1.16 and XBB.1.16.1 (2.4%), containing the K478R substitution, and XBB.2.3 (3.2%), containing the P521S substitution, had the fastest doubling times at that point. Analytic methods for estimating variant proportions have been updated as the availability of sequencing specimens has declined. The continued evolution of Omicron lineages highlights the importance of genomic surveillance to monitor emerging variants and help guide vaccine development and use of therapeutics.
Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , Pandemics , COVID-19/epidemiology , Genomics
ABSTRACT
Genomic surveillance is a critical tool for tracking emerging variants of SARS-CoV-2 (the virus that causes COVID-19), which can exhibit characteristics that potentially affect public health and clinical interventions, including increased transmissibility, illness severity, and capacity for immune escape. During June 2021-January 2022, CDC expanded genomic surveillance data sources to incorporate sequence data from public repositories to produce weighted estimates of variant proportions at the jurisdiction level and refined analytic methods to enhance the timeliness and accuracy of national and regional variant proportion estimates. These changes also allowed for more comprehensive variant proportion estimation at the jurisdictional level (i.e., U.S. state, district, territory, and freely associated state). The data in this report are a summary of findings of recent proportions of circulating variants that are updated weekly on CDC's COVID Data Tracker website to enable timely public health action. The SARS-CoV-2 Delta (B.1.617.2 and AY sublineages) variant rose from 1% to >50% of viral lineages circulating nationally during 8 weeks, from May 1-June 26, 2021. Delta-associated infections remained predominant until being rapidly overtaken by infections associated with the Omicron (B.1.1.529 and BA sublineages) variant in December 2021, when Omicron increased from 1% to >50% of circulating viral lineages during a 2-week period. As of the week ending January 22, 2022, Omicron was estimated to account for 99.2% (95% CI = 99.0%-99.5%) of SARS-CoV-2 infections nationwide, and Delta for 0.7% (95% CI = 0.5%-1.0%). The dynamic landscape of SARS-CoV-2 variants in 2021, including Delta- and Omicron-driven resurgences of SARS-CoV-2 transmission across the United States, underscores the importance of robust genomic surveillance efforts to inform public health planning and practice.
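As a rough illustration of the growth rates implied by the two takeovers described above, the sketch below assumes that a variant's share follows a logistic curve, that is, constant growth in log-odds. That model is an assumption made here for illustration only; it is not the weighted estimation method behind the reported proportions. The function names are hypothetical, and 0.50 is used as a lower bound for ">50%".

```python
import math


def logit(p: float) -> float:
    """Log-odds of a proportion p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def odds_doubling_time(p_start: float, p_end: float, days: float) -> float:
    """Days needed for the variant's odds to double.

    Assumes constant growth in log-odds between the two observations
    (an illustrative logistic-growth assumption).
    """
    rate_per_day = (logit(p_end) - logit(p_start)) / days
    return math.log(2.0) / rate_per_day


# Delta: 1% to 50% over 8 weeks; Omicron: 1% to 50% over 2 weeks.
delta = odds_doubling_time(0.01, 0.50, 8 * 7)
omicron = odds_doubling_time(0.01, 0.50, 2 * 7)
print(f"Delta: {delta:.1f} days, Omicron: {omicron:.1f} days")
```

Under this assumption, the same rise completed in a quarter of the time corresponds to a doubling time one quarter as long, which is one way to quantify how much faster Omicron displaced Delta than Delta displaced earlier lineages.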
Subject(s)
COVID-19/epidemiology , COVID-19/virology , SARS-CoV-2/genetics , Centers for Disease Control and Prevention, U.S. , Genomics , Humans , Prevalence , Public Health Surveillance/methods , United States/epidemiology
ABSTRACT
Flies are one of four superradiations of insects (along with beetles, wasps, and moths) that account for the majority of animal life on Earth. Diptera includes species known for their ubiquity (the house fly, Musca domestica), their role as pests (the malaria mosquito, Anopheles gambiae), and their value as model organisms across the biological sciences (Drosophila melanogaster). A resolved phylogeny for flies provides a framework for genomic, developmental, and evolutionary studies by facilitating comparisons across model organisms, yet recent research has suggested that fly relationships have been obscured by multiple episodes of rapid diversification. We provide a phylogenomic estimate of fly relationships based on molecules and morphology from 149 of 157 families, including 30 kb from 14 nuclear loci and complete mitochondrial genomes combined with 371 morphological characters. Multiple analyses show support for traditional groups (Brachycera, Cyclorrhapha, and Schizophora) and corroborate contentious findings, such as the anomalous Deuterophlebiidae as the sister group to all remaining Diptera. Our findings reveal that the closest relatives of the Drosophilidae are highly modified parasites (including the wingless Braulidae) of bees and other insects. Furthermore, we use micro-RNAs to resolve a node with implications for the evolution of embryonic development in Diptera. We demonstrate that flies experienced three episodes of rapid radiation (lower Diptera, 220 Ma; lower Brachycera, 180 Ma; and Schizophora, 65 Ma) and a number of life history transitions to hematophagy, phytophagy, and parasitism in the history of fly evolution over 260 million years.
Subject(s)
Adaptation, Biological/genetics , Biological Evolution , Diptera/anatomy & histology , Diptera/genetics , Phylogeny , Animals , Base Sequence , Bayes Theorem , Gene Library , Likelihood Functions , MicroRNAs/genetics , Models, Genetic , Molecular Sequence Data , Sequence Analysis, DNA , Species Specificity
ABSTRACT
In scientific research, integration and synthesis require a common understanding of where data come from, how much they can be trusted, and what they may be used for. To make such an understanding computer-accessible requires standards for exchanging richly annotated data. The challenges of conveying reusable data are particularly acute in regard to evolutionary comparative analysis, which comprises an ever-expanding list of data types, methods, research aims, and subdisciplines. To facilitate interoperability in evolutionary comparative analysis, we present NeXML, an XML standard (inspired by the current standard, NEXUS) that supports exchange of richly annotated comparative data. NeXML defines syntax for operational taxonomic units, character-state matrices, and phylogenetic trees and networks. Documents can be validated unambiguously. Importantly, any data element can be annotated, to an arbitrary degree of richness, using a system that is both flexible and rigorous. We describe how the use of NeXML by the TreeBASE and Phenoscape projects satisfies user needs that cannot be satisfied with other available file formats. By relying on XML Schema Definition, the design of NeXML facilitates the development and deployment of software for processing, transforming, and querying documents. The adoption of NeXML for practical use is facilitated by the availability of (1) an online manual with code samples and a reference to all defined elements and attributes, (2) programming toolkits in most of the languages used commonly in evolutionary informatics, and (3) input-output support in several widely used software applications. An active, open, community-based development process enables future revision and expansion of NeXML.
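To make the description of operational taxonomic units and trees more concrete, here is a minimal sketch that reads a small, hand-written NeXML-style fragment with Python's standard library. The fragment is a simplified assumption written for illustration: the element and attribute names and the namespace URI follow the general shape of NeXML documents as recalled here and have not been validated against the schema, so the official schema and manual remain the authority.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for NeXML elements (illustrative; check the schema).
NS = {"n": "http://www.nexml.org/2009"}

# Hand-written, simplified fragment: two OTUs and one tree with two tips.
DOC = """<nex:nexml xmlns:nex="http://www.nexml.org/2009"
    xmlns="http://www.nexml.org/2009" version="0.9">
  <otus id="taxa1">
    <otu id="t1" label="Drosophila melanogaster"/>
    <otu id="t2" label="Tribolium castaneum"/>
  </otus>
  <trees id="trees1" otus="taxa1">
    <tree id="tree1">
      <node id="n0" root="true"/>
      <node id="n1" otu="t1"/>
      <node id="n2" otu="t2"/>
      <edge id="e1" source="n0" target="n1" length="0.5"/>
      <edge id="e2" source="n0" target="n2" length="1.25"/>
    </tree>
  </trees>
</nex:nexml>"""

root = ET.fromstring(DOC)

# Map OTU ids to their human-readable labels.
labels = {otu.get("id"): otu.get("label")
          for otu in root.findall("n:otus/n:otu", NS)}

tree = root.find("n:trees/n:tree", NS)

# Map node ids to the OTU they reference (None for internal nodes).
node_to_otu = {node.get("id"): node.get("otu")
               for node in tree.findall("n:node", NS)}

# Collect (label, branch length) for every edge that ends in a tip.
tips = []
for edge in tree.findall("n:edge", NS):
    otu_id = node_to_otu.get(edge.get("target"))
    if otu_id is not None:
        tips.append((labels[otu_id], float(edge.get("length"))))

print(tips)
```

Because taxa, nodes, and edges are separate elements linked by identifiers, a generic XML parser is enough to recover the tree topology, which is the kind of tooling benefit the abstract attributes to building on XML Schema Definition.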
Subject(s)
Biological Evolution , Computational Biology/standards , Programming Languages , Biodiversity , Classification , Informatics , Models, Biological , Phylogeny , Software
ABSTRACT
Functional studies of the methuselah/methuselah-like (mth/mthl) gene family have focused on the founding member mth, but little is known regarding the developmental functions of this receptor or any of its paralogs. We undertook a comprehensive analysis of developmental expression and sequence divergence in the mth/mthl gene family. Using in situ hybridization techniques, we detect expression of six genes (mthl1, 5, 9, 11, 13, and 14) in the embryo during gastrulation and development of the gut, heart, and lymph glands. Four receptors (mthl3, 4, 6, and 8) are expressed in the larval central nervous system, imaginal discs, or both, and two receptors (mthl10 and mth) are expressed in both embryos and larvae. Phylogenetic analysis of all mth/mthl genes in five Drosophila species, mosquito and flour beetle structured the mth/mthl family into several subclades. mthl1, 5, and 14 are present in most species, each forming a separate clade. A newly identified Drosophila mthl gene (CG31720; herein mthl15) formed another ancient clade. The remaining Drosophila receptors, including mth, are members of a large "superclade" that diversified relatively recently during dipteran evolution, in many cases within the melanogaster subgroup. Comparing the expression patterns of the mth/mthl "superclade" paralogs to the embryonic expression of the singleton ortholog in Tribolium suggests both subfunctionalization and acquisition of novel functionalities. Taken together, our findings shed novel light on mth as a young member of an adaptively evolving developmental gene family.
Subject(s)
Adaptation, Biological/genetics , Drosophila Proteins/genetics , Drosophila melanogaster/genetics , Evolution, Molecular , Gene Expression Regulation, Developmental/genetics , Multigene Family/genetics , Phylogeny , Receptors, G-Protein-Coupled/genetics , Adaptation, Biological/physiology , Animals , Bayes Theorem , Computational Biology , Gene Expression Profiling , Gene Expression Regulation, Developmental/physiology , In Situ Hybridization , Models, Genetic , Species Specificity
ABSTRACT
Deep-level arthropod phylogeny has been in a state of upheaval ever since the emergence of molecular tree reconstruction approaches. While a consensus has settled in that hexapods are more closely related to crustaceans than to myriapods, the phylogenetic position of the latter has remained a matter of debate. Mitochondrial, nuclear, and genome-scale studies have proposed rejecting the long-standing superclade Mandibulata, which unites myriapods with insects and crustaceans, in favor of a clade that unites myriapods with chelicerates and has become known as Paradoxopoda or Myriochelata. Here we discuss the progress, problems, and prospects of arriving at the final arthropod tree.
Subject(s)
Arthropods/classification , Mites/classification , Phylogeny , Animals , Arthropods/genetics , Evolution, Molecular , Mites/genetics
ABSTRACT
BACKGROUND: Phyloinformatic analyses involve large amounts of data and metadata of complex structure. Collecting, processing, analyzing, visualizing and summarizing these data and metadata should be done in steps that can be automated and reproduced. This requires flexible, modular toolkits that can represent, manipulate and persist phylogenetic data and metadata as objects with programmable interfaces. RESULTS: This paper presents Bio::Phylo, a Perl5 toolkit for phyloinformatic analysis. It implements classes and methods that are compatible with the well-known BioPerl toolkit, but is independent from it (making it easy to install) and features a richer API and a data model that is better able to manage the complex relationships between different fundamental data and metadata objects in phylogenetics. It supports commonly used file formats for phylogenetic data including the novel NeXML standard, which allows rich annotations of phylogenetic data to be stored and shared. Bio::Phylo can interact with BioPerl, thereby giving access to the file formats that BioPerl supports. Many methods for data simulation, transformation and manipulation, the analysis of tree shape, and tree visualization are provided. CONCLUSIONS: Bio::Phylo is composed of 59 richly documented Perl5 modules. It has been deployed successfully on a variety of computer architectures (including various Linux distributions, Mac OS X versions, Windows, Cygwin and UNIX-like systems). It is available as open source (GPL) software from http://search.cpan.org/dist/Bio-Phylo.
Subject(s)
Computational Biology/methods , Phylogeny , Software , Computer Systems
ABSTRACT
The evolutionary origin of the Drosophila Pax transcription factor gene eyegone (eyg) has long been enigmatic owing to the failure in detecting orthologs in other species and the unusual N-terminal truncation of the DNA-binding paired domain (PD). Based on the discovery of eyg orthologs in representatives of hemichordate phyla, we show that the origin of eyg predated metazoan diversification and that the PD experienced similar but independent N-terminal modifications in the lineages to sea urchins and insects. Sequence conservation patterns further raise the possibility of persisting functionality in the N-terminal PD of strongly modified eyg orthologs. Finally, we note that the evolutionary histories of eyg and the vertebrate Pax6 isoform 5a, which have been considered functional homologs, are not correlated. Taken together, these findings identify Drosophila eyg as the baptizing member of an ancient Pax gene subfamily and recommend abandoning its classification as Pax6(5a)-related gene.
Subject(s)
Drosophila/genetics , Evolution, Molecular , Genome/genetics , Homeodomain Proteins/genetics , Paired Box Transcription Factors/genetics , Protein Isoforms/genetics , Amino Acid Sequence , Animals , Animals, Genetically Modified/genetics , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Drosophila/metabolism , Drosophila Proteins/genetics , Drosophila Proteins/metabolism , Eye Proteins/genetics , Eye Proteins/metabolism , Gene Expression Regulation, Developmental , Homeodomain Proteins/metabolism , Molecular Sequence Data , Mutation , Paired Box Transcription Factors/metabolism , Protein Isoforms/metabolism , Repressor Proteins/genetics
ABSTRACT
Obligatory cave species exhibit dramatic trait modifications such as eye reduction, loss of pigmentation and an increase in touch receptors. As molecular studies of cave adaptation have largely concentrated on vertebrate models, it is not yet possible to probe for genetic universalities underlying cave adaptation. We have therefore begun to study the strongly cave-adapted small carrion beetle Ptomaphagus hirtus. For over 100 years, this flightless signature inhabitant of Mammoth Cave, the world's largest known cave system, has been considered blind despite the presence of residual lens structures. By deep sequencing of the adult head transcriptome, we discovered the transcripts of all core members of the phototransduction protein machinery. Combined with the absence of transcripts of select structural photoreceptor and eye pigmentation genes, these data suggest a reduced but functional visual system in P. hirtus. This conclusion was corroborated by a negative phototactic response of P. hirtus in light/dark choice tests. We further detected the expression of the complete circadian clock gene network in P. hirtus, raising the possibility of a role of light sensation in the regulation of oscillating processes. We speculate that P. hirtus is representative of a large number of animal species with highly reduced but persisting visual capacities in the twilight zone of the subterranean realm. These can now be studied on a broad comparative scale given the efficiency of transcript discovery by next-generation sequencing.
Subject(s)
Adaptation, Physiological/physiology , Biological Evolution , Caves , Circadian Rhythm Signaling Peptides and Proteins/metabolism , Coleoptera/physiology , Light Signal Transduction/genetics , Adaptation, Physiological/genetics , Animals , Base Sequence , Circadian Rhythm Signaling Peptides and Proteins/genetics , Computational Biology , Demography , Kentucky , Likelihood Functions , Models, Genetic , Molecular Sequence Data , Photic Stimulation , Phylogeny , Pigmentation/genetics , Sequence Analysis, DNA
ABSTRACT
Legionella spp. are the cause of a severe bacterial pneumonia known as Legionnaires' disease (LD). In some cases, current genetic subtyping methods cannot resolve LD outbreaks caused by common, potentially endemic L. pneumophila (Lp) sequence types (ST), which complicates laboratory investigations and environmental source attribution. In the United States (US), ST1 is the most prevalent clinical and environmental Lp sequence type. In order to characterize the ST1 population, we sequenced 289 outbreak and non-outbreak associated clinical and environmental ST1 and ST1-variant Lp strains from the US and, together with international isolate sequences, explored their genetic and geographic diversity. The ST1 population was highly conserved at the nucleotide level; 98% of core nucleotide positions were invariant and environmental isolates unassociated with human disease (n = 99) contained ~65% more nucleotide diversity compared to clinical-sporadic (n = 139) or outbreak-associated (n = 28) ST1 subgroups. The accessory pangenome of environmental isolates was also ~30-60% larger than other subgroups and was enriched for transposition and conjugative transfer-associated elements. Up to ~10% of US ST1 genetic variation could be explained by geographic origin, but considerable genetic conservation existed among strains isolated from geographically distant states and from different decades. These findings provide new insight into the ST1 population structure and establish a foundation for interpreting genetic relationships among ST1 strains; these data may also inform future analyses for improved outbreak investigations.
Subject(s)
Disease Outbreaks , Legionella pneumophila/genetics , Legionnaires' Disease/microbiology , Molecular Typing/methods , Base Sequence , Conserved Sequence , Genetic Heterogeneity , Genotype , Humans , Legionnaires' Disease/epidemiology , Phylogeny
ABSTRACT
The majority of Legionnaires' disease (LD) cases are caused by Legionella pneumophila, a genetically heterogeneous species composed of at least 17 serogroups. Previously, it was demonstrated that L. pneumophila consists of three subspecies: pneumophila, fraseri and pascullei. During an LD outbreak investigation in 2012, we detected that representatives of both subspecies fraseri and pascullei colonized the same water system and that the outbreak-causing strain was a new member of the least represented subspecies pascullei. We used partial sequence based typing consensus patterns to mine an international database for additional representatives of fraseri and pascullei subspecies. As a result, we identified 46 sequence types (STs) belonging to subspecies fraseri and two STs belonging to subspecies pascullei. Moreover, a recent retrospective whole genome sequencing analysis of isolates from New York State LD clusters revealed the presence of a fourth L. pneumophila subspecies that we have termed raphaeli. This subspecies consists of 15 STs. Comparative analysis was conducted using the genomes of multiple members of all four L. pneumophila subspecies. Whereas each subspecies forms a distinct phylogenetic clade within the L. pneumophila species, they share more average nucleotide identity with each other than with other Legionella species. Unique genes for each subspecies were identified and could be used for rapid subspecies detection. Improved taxonomic classification of L. pneumophila strains may help identify environmental niches and virulence attributes associated with these genetically distinct subspecies.
Subject(s)
Genome, Bacterial/genetics , Legionella pneumophila/genetics , Legionnaires' Disease/microbiology , Comparative Genomic Hybridization , Cross Infection/microbiology , DNA, Bacterial/genetics , Disease Outbreaks , Humans , Phylogeny , Polymorphism, Single Nucleotide/genetics
ABSTRACT
Legionella spp. present in some human-made water systems can cause Legionnaires' disease in susceptible individuals. Although legionellae have been isolated from the natural environment, variations in the organism's abundance over time and its relationship to aquatic microbiota are poorly understood. Here, we investigated the presence and diversity of legionellae through 16S rRNA gene amplicon and metagenomic sequencing of DNA from isolates collected from seven sites in three watersheds with varied land uses over a period of 1 year. Legionella spp. were found in all watersheds and sampling sites, comprising up to 2.1% of the bacterial community composition. The relative abundance of Legionella tended to be higher in pristine sites than in sites affected by agricultural activity. The relative abundance levels of Amoebozoa, some of which are natural hosts of legionellae, were similarly higher in pristine sites. Compared to other bacterial genera detected, Legionella had both the highest richness and highest alpha diversity. Our findings indicate that a highly diverse population of legionellae may be found in a variety of natural aquatic sources. Further characterization of these diverse natural populations of Legionella will help inform prevention and control efforts aimed at reducing the risk of Legionella colonization of built environments, which could ultimately decrease the risk of human disease. IMPORTANCE: Many species of Legionella can cause Legionnaires' disease, a significant cause of bacterial pneumonia. Legionella in human-made water systems such as cooling towers and building plumbing systems are the primary sources of Legionnaires' disease outbreaks. In this temporal study of natural aquatic environments, Legionella relative abundance was shown to vary in watersheds associated with different land uses. Analysis of the Legionella sequences detected at these sites revealed highly diverse populations that included potentially novel Legionella species. These findings have important implications for understanding the ecology of Legionella and control measures for this pathogen that are aimed at reducing human disease.
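For readers unfamiliar with the terms, richness and alpha diversity can be computed from per-taxon read counts as in the sketch below. The abstract does not say which alpha diversity metric was used; the Shannon index shown here is a common choice and is an assumption, as are the function names and the toy counts.

```python
import math
from typing import Sequence


def richness(counts: Sequence[int]) -> int:
    """Number of taxa observed at least once in a sample."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts: Sequence[int]) -> float:
    """Shannon diversity H = -sum(p_i * ln p_i) over observed taxa."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Toy read counts per taxon (hypothetical, for illustration only).
even_sample = [10, 10, 10, 10]
skewed_sample = [37, 1, 1, 1]

print(richness(even_sample), shannon_index(even_sample))
print(richness(skewed_sample), shannon_index(skewed_sample))
```

Both toy samples have the same richness, but the evenly distributed one has the higher Shannon index, which is why the two measures are reported separately.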
ABSTRACT
The adaptive significance of human brain evolution has been frequently studied through comparisons with other primates. However, the evolution of increased brain size is not restricted to the human lineage but is a general characteristic of primate evolution. Whether or not these independent episodes of increased brain size share a common genetic basis is unclear. We sequenced and de novo assembled the transcriptome from the neocortical tissue of the most highly encephalized nonhuman primate, the tufted capuchin monkey (Cebus apella). Using this novel data set, we conducted a genome-wide analysis of orthologous brain-expressed protein coding genes to identify evidence of conserved gene-phenotype associations and species-specific adaptations during three independent episodes of brain size increase. We identify a greater number of genes associated with either total brain mass or relative brain size across these six species than show species-specific accelerated rates of evolution in individual large-brained lineages. We test the robustness of these associations in an expanded data set of 13 species, through permutation tests and by analyzing how genome-wide patterns of substitution co-vary with brain size. Many of the genes targeted by selection during brain expansion have glutamatergic functions or roles in cell cycle dynamics. We also identify accelerated evolution in a number of individual capuchin genes whose human orthologs are associated with human neuropsychiatric disorders. These findings demonstrate the value of phenotypically informed genome analyses, and suggest at least some aspects of human brain evolution have occurred through conserved gene-phenotype associations. Understanding these commonalities is essential for distinguishing human-specific selection events from general trends in brain evolution.
Subject(s)
Biological Evolution , Brain/anatomy & histology , Cebus/anatomy & histology , Primates/anatomy & histology , Animals , Brain/physiology , Cebus/genetics , Genetic Association Studies , Humans , Phylogeny , Primates/genetics , Selection, Genetic , Species Specificity
ABSTRACT
Mycoplasma pneumoniae is a significant cause of respiratory illness worldwide. Despite a minimal and highly conserved genome, genetic diversity within the species may impact disease. We performed whole genome sequencing (WGS) analysis of 107 M. pneumoniae isolates, including 67 newly sequenced using the Pacific BioSciences RS II and/or Illumina MiSeq sequencing platforms. Comparative genomic analysis of 107 genomes revealed >3,000 single nucleotide polymorphisms (SNPs) in total, including 520 type-specific SNPs. Population structure analysis supported the existence of six distinct subgroups, three within each type. We developed a predictive model to classify an isolate based on whole genome SNPs called against the reference genome into the identified subtypes, obviating the need for genome assembly. This study is the most comprehensive WGS analysis for M. pneumoniae to date, underscoring the power of combining complementary sequencing technologies to overcome difficult-to-sequence regions and highlighting potential differential genomic signatures in M. pneumoniae.
Subject(s)
Computational Biology , Genome, Bacterial , Mycoplasma pneumoniae/genetics , Bacterial Typing Techniques , Bayes Theorem , Cluster Analysis , Genetic Variation , High-Throughput Nucleotide Sequencing , Mycoplasma pneumoniae/classification , Phylogeny , Polymorphism, Single Nucleotide , Sequence Analysis, DNA
ABSTRACT
[This corrects the article DOI: 10.1371/journal.pone.0174701.].
ABSTRACT
INTRODUCTION: The placenta is arguably the most anatomically variable organ in mammals even though its primary function is conserved. METHOD: Using RNA-Seq, we measured the expression profiles of 55 term placentas of 14 species of mammals representing all major eutherian superordinal clades and marsupials, and compared the evolution of expression across clades. RESULTS: We identified a set of 115 core genes which is expressed (FPKM ≥10) in all eutherian placentas, including genes with immune-modulating properties (ANXA2, ANXA1, S100A11, S100A10, and LGALS1), cell-cell interactions (LAMC1, LUM, and LGALS1), invasion (GRB2 and RALB) and syncytialization (ANXA5 and ANXA1). We also identified multiple pre-eclampsia associated genes which are differentially expressed in Homo sapiens when compared to the other 13 species. Multiple genes are significantly associated with placenta morphology, including EREG and WNT5A which are both associated with placental shape. DISCUSSION: 115 genes are important for the core functions of the placenta in all eutherian species analyzed. The molecular functions and pathways enriched in the core placenta align with the evolutionarily conserved functionality of the placenta.
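The core-gene criterion described in the results (expressed at FPKM ≥10 in every sample) amounts to a set intersection. The sketch below shows that logic on toy data; the sample names, gene names, and values are hypothetical placeholders and not data from the study, and the function name is an assumption.

```python
from typing import Dict, Set


def core_genes(expression: Dict[str, Dict[str, float]],
               threshold: float = 10.0) -> Set[str]:
    """Genes whose expression meets the threshold in every sample.

    `expression` maps sample name -> {gene name -> FPKM}. A gene missing
    from a sample is treated as not expressed in that sample.
    """
    samples = list(expression.values())
    if not samples:
        return set()
    core = {gene for gene, fpkm in samples[0].items() if fpkm >= threshold}
    for sample in samples[1:]:
        core &= {gene for gene, fpkm in sample.items() if fpkm >= threshold}
    return core


# Hypothetical FPKM values for three placeholder samples.
toy = {
    "sample_1": {"geneA": 55.0, "geneB": 12.0, "geneC": 3.0},
    "sample_2": {"geneA": 10.0, "geneB": 9.9, "geneC": 40.0},
    "sample_3": {"geneA": 120.0, "geneB": 30.0},
}
print(core_genes(toy))
```

In this toy input only `geneA` reaches the threshold in all three samples, so it is the only member of the core set.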
Subject(s)
Biological Evolution , Mammals/metabolism , Placenta/metabolism , Transcriptome , Actins/metabolism , Animals , Annexins/metabolism , Cattle , Dogs , Epidermal Growth Factor/metabolism , Female , Humans , Mammals/anatomy & histology , Mice , Placenta/anatomy & histology , Pre-Eclampsia/genetics , Pre-Eclampsia/metabolism , Pregnancy
ABSTRACT
Rupture of the extraembryonic membranes that form the gestational sac is a typical feature of human parturition. However, preterm premature rupture of membranes (PPROM) occurs in approximately 1% of pregnancies, and is a leading cause of preterm birth. Conversely, retention of an intact gestational sac during parturition in the form of a caul is a rare occurrence. Understanding the molecular and evolutionary underpinnings of these disparate phenotypes can provide insight into both normal pregnancy and PPROM. Using phylogenetic techniques, we reconstructed the evolution of the gestational sac phenotype at parturition in 55 mammal species representing all major viviparous mammal groups. We infer that the ancestral state in therians, eutherians, and primates, as in humans, is a ruptured gestational sac at parturition. We present evidence that intact membranes at parturition have evolved convergently in diverse mammals including horses, elephants, and bats. To gain insight into the molecular underpinnings of the evolution of enhanced membrane integrity, we also used comparative genomics techniques to reconstruct the evolution of a subset of genes implicated in PPROM, and find that four genes (ADAMTS2, COL1A1, COL5A1, LEPRE1) show significant evidence of increased nonsynonymous rates of substitution on lineages with intact membranes as compared to those with ruptured membranes. Among these genes, we also discovered that 17 human SNPs are associated with or near amino acid replacement sites in those mammals with intact membranes. These SNPs are candidate functional variants within humans, which may play roles in PPROM and/or the retention of the gestational sac at birth.