Results 1 - 20 of 42
1.
DNA Res ; 31(3), 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38686638

ABSTRACT

Lodderomyces beijingensis is an ascosporic ascomycetous yeast. In contrast to the related species Lodderomyces elongisporus, which is a recently emerging human pathogen, L. beijingensis is associated with insects. To provide an insight into its genetic makeup, we investigated the genome of its type strain, CBS 14171. We demonstrate that this yeast is diploid and describe the high-contiguity nuclear genome assembly consisting of eight chromosome-sized contigs with a total size of about 15.1 Mbp. We find that the genome sequence contains multiple copies of the mating type loci and codes for essential components of the mating pheromone response pathway; however, the missing orthologs of several genes involved in the meiotic program raise questions about the mode of sexual reproduction. We also show that the L. beijingensis genome codes for the 3-oxoadipate pathway enzymes, which allow the assimilation of protocatechuate. In contrast, the GAL gene cluster underwent a decay resulting in an inability of L. beijingensis to utilize galactose. Moreover, we find that the 56.5 kbp long mitochondrial DNA is structurally similar to known linear mitochondrial genomes terminating on both sides with covalently closed single-stranded hairpins. Finally, we discovered a new double-stranded RNA mycovirus from the Totiviridae family and characterized its genome sequence.


Subjects
Fungal Chromosomes , Fungal Mating-Type Genes , Fungal Genome , Fungal Chromosomes/genetics , Saccharomycetales/genetics , Saccharomycetales/metabolism
2.
bioRxiv ; 2023 Nov 22.
Article in English | MEDLINE | ID: mdl-38045397

ABSTRACT

An annotation is a set of genomic intervals sharing a particular function or property. Examples include genes, conserved elements, and epigenetic modifications. A common task is to compare two annotations to determine if one is enriched or depleted in the regions covered by the other. We study the problem of assigning statistical significance to such a comparison based on a null model representing two random unrelated annotations. Previous approaches to this problem remain too slow or inaccurate. To incorporate more background information into such analyses and avoid biased results, we propose a new null model based on a Markov chain which differentiates among several genomic contexts. These contexts can capture various confounding factors, such as GC content or sequencing gaps. We then develop a new algorithm for estimating p-values by computing the exact expectation and variance of the test statistics and then estimating the p-value using a normal approximation. Compared to the previous algorithm by Gafurov et al., the new algorithm provides three advances: (1) the running time is improved from quadratic to linear or quasi-linear, (2) the algorithm can handle two different test statistics, and (3) the algorithm can handle both simple and context-dependent Markov chain null models. We demonstrate the efficiency and accuracy of our algorithm on synthetic and real data sets, including the recent human telomere-to-telomere assembly. In particular, our algorithm computed p-values for 450 pairs of human genome annotations using 24 threads in under three hours. The use of genomic contexts to correct for GC-bias also resulted in the reversal of some previously published findings. Availability: The software is freely available at https://github.com/fmfi-compbio/mcdp2 under the MIT licence. All data for reproducibility are available at https://github.com/fmfi-compbio/mcdp2-reproducibility.
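
The last step described in the abstract above, turning an expectation and variance into a p-value through a normal approximation, can be illustrated with a minimal Python sketch. The function name and its arguments are hypothetical and are not taken from the mcdp2 software; the sketch assumes the mean and variance of the test statistic under the null model have already been computed by some other means.

```python
import math


def normal_approx_p_values(observed, mean, variance):
    """Return (enrichment_p, depletion_p) for an observed test statistic.

    Assumes the statistic is approximately normal under the null model,
    with the given mean and variance (hypothetical inputs).
    """
    if variance <= 0:
        raise ValueError("variance must be positive")
    z = (observed - mean) / math.sqrt(variance)
    # Upper tail P(Z >= z): small values suggest enrichment.
    p_enrich = 0.5 * math.erfc(z / math.sqrt(2))
    # Lower tail P(Z <= z): small values suggest depletion.
    p_deplete = 0.5 * math.erfc(-z / math.sqrt(2))
    return p_enrich, p_deplete
```

The two tails always sum to one, so a single z-score answers both the enrichment and the depletion question.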

3.
Front Microbiol ; 14: 1267695, 2023.
Article in English | MEDLINE | ID: mdl-37869681

ABSTRACT

Identification of plasmids from sequencing data is an important and challenging problem related to antimicrobial resistance spread and other One-Health issues. We provide a new architecture for identifying plasmid contigs in fragmented genome assemblies built from short-read data. We employ graph neural networks (GNNs) and the assembly graph to propagate the information from nearby nodes, which leads to more accurate classification, especially for short contigs that are difficult to classify based on sequence features or database searches alone. We trained plASgraph2 on a data set of samples from the ESKAPEE group of pathogens. plASgraph2 either outperforms or performs on par with a wide range of state-of-the-art methods on testing sets of independent ESKAPEE samples and samples from related pathogens. On one hand, our study provides a new accurate and easy to use tool for contig classification in bacterial isolates; on the other hand, it serves as a proof-of-concept for the use of GNNs in genomics. Our software is available at https://github.com/cchauve/plasgraph2 and the training and testing data sets are available at https://github.com/fmfi-compbio/plasgraph2-datasets.
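
To give a feel for what propagating information from nearby nodes of an assembly graph means, here is a minimal Python sketch of one neighbour-averaging step. It is not the plASgraph2 architecture, which the abstract does not detail and which uses learned graph neural network layers; the function name, the `self_weight` parameter, and the example contig labels are all hypothetical.

```python
def propagate_once(scores, edges, self_weight=0.5):
    """Blend each contig's score with the mean score of its graph neighbours.

    scores: dict mapping contig name -> plasmid score in [0, 1] (assumed input)
    edges:  iterable of (contig_a, contig_b) pairs from the assembly graph
    """
    neighbours = {node: [] for node in scores}
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)
    updated = {}
    for node, score in scores.items():
        adjacent = neighbours[node]
        if adjacent:
            mean = sum(scores[n] for n in adjacent) / len(adjacent)
            updated[node] = self_weight * score + (1 - self_weight) * mean
        else:
            # Isolated contigs keep their own score.
            updated[node] = score
    return updated
```

A short contig with an uninformative score of its own is pulled toward the scores of the longer contigs it is connected to, which is the intuition behind using the graph at all.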

4.
Bioinformatics ; 39(6), 2023 06 01.
Article in English | MEDLINE | ID: mdl-37326967

ABSTRACT

MOTIVATION: Short tandem repeats (STRs) are regions of a genome containing many consecutive copies of the same short motif, possibly with small variations. Analysis of STRs has many clinical uses but is limited by technology mainly due to STRs surpassing the used read length. Nanopore sequencing, as one of long-read sequencing technologies, produces very long reads, thus offering more possibilities to study and analyze STRs. Basecalling of nanopore reads is however particularly unreliable in repeating regions, and therefore direct analysis from raw nanopore data is required. RESULTS: Here, we present WarpSTR, a novel method for characterizing both simple and complex tandem repeats directly from raw nanopore signals using a finite-state automaton and a search algorithm analogous to dynamic time warping. By applying this approach to determine the lengths of 241 STRs, we demonstrate that our approach decreases the mean absolute error of the STR length estimate compared to basecalling and STRique. AVAILABILITY AND IMPLEMENTATION: WarpSTR is freely available at https://github.com/fmfi-compbio/warpstr.
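
The abstract above describes a search algorithm analogous to dynamic time warping. For readers unfamiliar with the technique, the following minimal Python sketch implements classic dynamic time warping between two numeric sequences. It is not WarpSTR's algorithm, which additionally relies on a finite-state automaton; the function name and the absolute-difference cost are illustrative assumptions.

```python
def dtw_distance(signal, template):
    """Classic dynamic time warping cost between two numeric sequences."""
    n, m = len(signal), len(template)
    inf = float("inf")
    # cost[i][j] = best cost of aligning signal[:i] with template[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(signal[i - 1] - template[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # signal sample repeats a template value
                cost[i][j - 1],      # template value is passed over quickly
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]
```

Because either sequence may be stretched locally, a signal that dwells longer on one level still aligns at zero cost, which is why this family of methods suits nanopore signals with variable translocation speed.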


Subjects
Nanopores , High-Throughput Nucleotide Sequencing/methods , Genome , Algorithms , Microsatellite Repeats , DNA Sequence Analysis
5.
Bioinformatics ; 39(Suppl 1): i288-i296, 2023 06 30.
Article in English | MEDLINE | ID: mdl-37387134

ABSTRACT

MOTIVATION: The analysis of bacterial isolates to detect plasmids is important due to their role in the propagation of antimicrobial resistance. In short-read sequence assemblies, both plasmids and bacterial chromosomes are typically split into several contigs of various lengths, making identification of plasmids a challenging problem. In plasmid contig binning, the goal is to distinguish short-read assembly contigs based on their origin into plasmid and chromosomal contigs and subsequently sort plasmid contigs into bins, each bin corresponding to a single plasmid. Previous works on this problem consist of de novo approaches and reference-based approaches. De novo methods rely on contig features such as length, circularity, read coverage, or GC content. Reference-based approaches compare contigs to databases of known plasmids or plasmid markers from finished bacterial genomes. RESULTS: Recent developments suggest that leveraging information contained in the assembly graph improves the accuracy of plasmid binning. We present PlasBin-flow, a hybrid method that defines contig bins as subgraphs of the assembly graph. PlasBin-flow identifies such plasmid subgraphs through a mixed integer linear programming model that relies on the concept of network flow to account for sequencing coverage, while also accounting for the presence of plasmid genes and the GC content that often distinguishes plasmids from chromosomes. We demonstrate the performance of PlasBin-flow on a real dataset of bacterial samples. AVAILABILITY AND IMPLEMENTATION: https://github.com/cchauve/PlasBin-flow.


Subjects
Algorithms , Bacterial Genome , Plasmids/genetics , Cell Movement , Factual Databases
6.
Microbiol Resour Announc ; 12(3): e0000523, 2023 Mar 16.
Article in English | MEDLINE | ID: mdl-36840572

ABSTRACT

Candida verbasci is an anamorphic ascomycetous yeast. We report the genome sequence of its type strain, 11-1055 (CBS 12699). The nuclear genome assembly consists of seven chromosome-sized contigs with a total size of 12.1 Mbp and has a relatively low G+C content (28.1%).
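
The G+C content quoted above is the fraction of unambiguous bases that are G or C. A minimal Python sketch of that calculation follows; the function name is hypothetical, and ignoring ambiguous symbols such as N is an assumption, since conventions differ between tools.

```python
def gc_content(sequence):
    """Fraction of G and C among the A, C, G, T bases of a sequence."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    # Ambiguous symbols (e.g. N) are excluded from the denominator.
    return gc / (gc + at)
```

For a whole assembly, the counts would be summed over all contigs before dividing, rather than averaging per-contig fractions.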

7.
BMC Bioinformatics ; 23(1): 551, 2022 Dec 19.
Article in English | MEDLINE | ID: mdl-36536300

ABSTRACT

BACKGROUND: The genomes of SARS-CoV-2 are classified into variants, some of which are monitored as variants of concern (e.g. the Delta variant B.1.617.2 or Omicron variant B.1.1.529). Proportions of these variants circulating in a human population are typically estimated by large-scale sequencing of individual patient samples. Sequencing a mixture of SARS-CoV-2 RNA molecules from wastewater provides a cost-effective alternative, but requires methods for estimating variant proportions in a mixed sample. RESULTS: We propose a new method based on a probabilistic model of sequencing reads, capturing sequence diversity present within individual variants, as well as sequencing errors. The algorithm is implemented in an open source Python program called VirPool. We evaluate the accuracy of VirPool on several simulated and real sequencing data sets from both Illumina and nanopore sequencing platforms, including wastewater samples from Austria and France monitoring the onset of the Alpha variant. CONCLUSIONS: VirPool is a versatile tool for wastewater and other mixed-sample analysis that can handle both short- and long-read sequencing data. Our approach does not require pre-selection of characteristic mutations for variant profiles, it is able to use the entire length of reads instead of just the most informative positions, and can also capture haplotype dependencies within a single read.
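
Estimating variant proportions in a mixed sample can be framed as fitting mixture weights. The Python sketch below shows a generic expectation-maximization loop for that task, assuming a precomputed table of per-read likelihoods under each variant. It is an illustration only: the abstract does not specify VirPool's probabilistic model or its optimization procedure, and the function name and input layout are hypothetical.

```python
def estimate_proportions(read_likelihoods, iterations=200):
    """Estimate mixture weights of variants from per-read likelihoods.

    read_likelihoods[r][v] is an assumed, precomputed P(read r | variant v).
    """
    n_variants = len(read_likelihoods[0])
    weights = [1.0 / n_variants] * n_variants
    for _ in range(iterations):
        totals = [0.0] * n_variants
        for row in read_likelihoods:
            # E-step: posterior responsibility of each variant for this read.
            joint = [w * p for w, p in zip(weights, row)]
            norm = sum(joint)
            if norm == 0:
                continue  # read not explained by any variant; skip it
            for v, value in enumerate(joint):
                totals[v] += value / norm
        used = sum(totals)
        if used == 0:
            break
        # M-step: new weights are the normalized responsibilities.
        weights = [t / used for t in totals]
    return weights
```

With reads that each match exactly one variant, the estimate reduces to simple counting; the loop matters when a read is compatible with several variants to different degrees.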


Subjects
COVID-19 , SARS-CoV-2 , Wastewater , Humans , Viral RNA , SARS-CoV-2/genetics , SARS-CoV-2/isolation & purification , Wastewater/virology
8.
Bioinformatics ; 38(Suppl 1): i203-i211, 2022 06 24.
Article in English | MEDLINE | ID: mdl-35758770

ABSTRACT

MOTIVATION: Genome annotations are a common way to represent genomic features such as genes, regulatory elements or epigenetic modifications. The amount of overlap between two annotations is often used to ascertain if there is an underlying biological connection between them. In order to distinguish between true biological association and overlap by pure chance, a robust measure of significance is required. One common way to do this is to determine if the number of intervals in the reference annotation that intersect the query annotation is statistically significant. However, currently employed statistical frameworks are often either inefficient or inaccurate when computing P-values on the scale of the whole human genome. RESULTS: We show that finding the P-values under the typically used 'gold' null hypothesis is NP-hard. This motivates us to reformulate the null hypothesis using Markov chains. To be able to measure the fidelity of our Markovian null hypothesis, we develop a fast direct sampling algorithm to estimate the P-value under the gold null hypothesis. We then present an open-source software tool MCDP that computes the P-values under the Markovian null hypothesis in O(m^2 + n) time and O(m) memory, where m and n are the numbers of intervals in the reference and query annotations, respectively. Notably, MCDP runtime and memory usage are independent from the genome length, allowing it to outperform previous approaches in runtime and memory usage by orders of magnitude on human genome annotations, while maintaining the same level of accuracy. AVAILABILITY AND IMPLEMENTATION: The software is available at https://github.com/fmfi-compbio/mc-overlaps. All data for reproducibility are available at https://github.com/fmfi-compbio/mc-overlaps-reproducibility. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
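
The test statistic named in the abstract above, the number of reference intervals that intersect the query annotation, can be computed with a simple sweep. The Python sketch below is an illustration and not code from MCDP; it assumes half-open intervals on a single chromosome and query intervals that do not overlap one another, and the function name is hypothetical.

```python
def count_overlapping(reference, query):
    """Count reference intervals that intersect at least one query interval.

    Intervals are half-open (start, end) pairs on one chromosome.
    Assumes the query intervals are pairwise disjoint.
    """
    reference = sorted(reference)
    query = sorted(query)
    count = 0
    j = 0
    for start, end in reference:
        # Query intervals ending at or before this start cannot intersect
        # this or any later reference interval, so skip them for good.
        while j < len(query) and query[j][1] <= start:
            j += 1
        if j < len(query) and query[j][0] < end:
            count += 1
    return count
```

Computing the statistic itself is cheap; the hard part addressed by the paper is deciding how surprising its value is under a null model.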


Subjects
Human Genome , Software , Gold , Humans , Markov Chains , Reproducibility of Results
9.
PLoS Genet ; 18(3): e1009815, 2022 03.
Article in English | MEDLINE | ID: mdl-35255079

ABSTRACT

Many fungal species utilize hydroxyderivatives of benzene and benzoic acid as carbon sources. The yeast Candida parapsilosis metabolizes these compounds via the 3-oxoadipate and gentisate pathways, whose components are encoded by two metabolic gene clusters. In this study, we determine the chromosome level assembly of the C. parapsilosis strain CLIB214 and use it for transcriptomic and proteomic investigation of cells cultivated on hydroxyaromatic substrates. We demonstrate that the genes coding for enzymes and plasma membrane transporters involved in the 3-oxoadipate and gentisate pathways are highly upregulated and their expression is controlled in a substrate-specific manner. However, regulatory proteins involved in this process are not known. Using the knockout mutants, we show that putative transcriptional factors encoded by the genes OTF1 and GTF1 located within these gene clusters function as transcriptional activators of the 3-oxoadipate and gentisate pathway, respectively. We also show that the activation of both pathways is accompanied by upregulation of genes for the enzymes involved in β-oxidation of fatty acids, glyoxylate cycle, amino acid metabolism, and peroxisome biogenesis. Transcriptome and proteome profiles of the cells grown on 4-hydroxybenzoate and 3-hydroxybenzoate, which are metabolized via the 3-oxoadipate and gentisate pathway, respectively, reflect their different connection to central metabolism. Yet we find that the expression profiles differ also in the cells assimilating 4-hydroxybenzoate and hydroquinone, which are both metabolized in the same pathway. This finding is consistent with the phenotype of the Otf1p-lacking mutant, which exhibits impaired growth on hydroxybenzoates, but still utilizes hydroxybenzenes, thus indicating that an additional, as yet unidentified transcription factor could be involved in the 3-oxoadipate pathway regulation. Moreover, we propose that bicarbonate ions resulting from decarboxylation of hydroxybenzoates also contribute to differences in the cell responses to hydroxybenzoates and hydroxybenzenes. Finally, our phylogenetic analysis highlights evolutionary paths leading to metabolic adaptations of yeast cells assimilating hydroxyaromatic substrates.


Subjects
Candida parapsilosis , Gentisates , Candida parapsilosis/metabolism , Carbon , Gentisates/metabolism , Hydroxybenzoates/metabolism , Phylogeny , Proteome/genetics , Proteomics , Saccharomyces cerevisiae/metabolism , Transcriptome/genetics
10.
IEEE/ACM Trans Comput Biol Bioinform ; 19(6): 3416-3424, 2022.
Article in English | MEDLINE | ID: mdl-34784283

ABSTRACT

In nanopore sequencing, electrical signal is measured as DNA molecules pass through the sequencing pores. Translating these signals into DNA bases (base calling) is a highly non-trivial task, and its quality has a large impact on the sequencing accuracy. The most successful nanopore base callers to date use convolutional neural networks (CNN) to accomplish the task. Convolutional layers in CNNs are typically composed of filters with constant window size, performing best in analysis of signals with uniform speed. However, the speed of nanopore sequencing varies greatly both within reads and between sequencing runs. Here, we present dynamic pooling, a novel neural network component, which addresses this problem by adaptively adjusting the pooling ratio. To demonstrate the usefulness of dynamic pooling, we developed two base callers: Heron and Osprey. Heron improves the accuracy beyond the experimental high-accuracy base caller Bonito developed by Oxford Nanopore. Osprey is a fast base caller that can compete in accuracy with Guppy high-accuracy mode, but does not require GPU acceleration and achieves a near real-time speed on common desktop CPUs. Availability: https://github.com/fmfi-compbio/osprey, https://github.com/fmfi-compbio/heron.


Subjects
Nanopores , Software , DNA Sequence Analysis , High-Throughput Nucleotide Sequencing , DNA/genetics
11.
Sci Rep ; 11(1): 20494, 2021 10 14.
Article in English | MEDLINE | ID: mdl-34650153

ABSTRACT

The emergence of a novel SARS-CoV-2 B.1.1.7 variant sparked global alarm due to increased transmissibility, mortality, and uncertainty about vaccine efficacy, thus accelerating efforts to detect and track the variant. Current approaches to detect B.1.1.7 include sequencing and RT-qPCR tests containing a target assay that fails or results in reduced sensitivity towards the B.1.1.7 variant. Since many countries lack genomic surveillance programs and failed assays detect unrelated variants containing similar mutations as B.1.1.7, we used allele-specific PCR, and judicious placement of LNA-modified nucleotides to develop an RT-qPCR test that accurately and rapidly differentiates B.1.1.7 from other SARS-CoV-2 variants. We validated the test on 106 clinical samples with lineage status confirmed by sequencing and conducted a country-wide surveillance study of B.1.1.7 prevalence in Slovakia. Our multiplexed RT-qPCR test showed 97% clinical sensitivity and retesting 6,886 SARS-CoV-2 positive samples obtained during three campaigns performed within one month, revealed pervasive spread of B.1.1.7 with an average prevalence of 82%. Labs can easily implement this test to rapidly scale B.1.1.7 surveillance efforts and it is particularly useful in countries with high prevalence of variants possessing only the ΔH69/ΔV70 deletion because current strategies using target failure assays incorrectly identify these as putative B.1.1.7 variants.


Subjects
COVID-19 Nucleic Acid Testing/methods , COVID-19/diagnosis , COVID-19/virology , Multiplex Polymerase Chain Reaction/methods , SARS-CoV-2/genetics , Alleles , COVID-19/epidemiology , Humans , Mutation , Prevalence , Viral RNA/genetics , SARS-CoV-2/isolation & purification , Slovakia/epidemiology
12.
PLoS One ; 16(10): e0259277, 2021.
Article in English | MEDLINE | ID: mdl-34714886

ABSTRACT

Surveillance of the SARS-CoV-2 variants including the quickly spreading mutants by rapid and near real-time sequencing of the viral genome provides an important tool for effective health policy decision making in the ongoing COVID-19 pandemic. Here we evaluated PCR-tiling of short (~400-bp) and long (~2 and ~2.5-kb) amplicons combined with nanopore sequencing on a MinION device for analysis of the SARS-CoV-2 genome sequences. Analysis of several sequencing runs demonstrated that using the long amplicon schemes outperforms the original protocol based on the 400-bp amplicons. It also illustrated common artefacts and problems associated with PCR-tiling approach, such as uneven genome coverage, variable fraction of discarded sequencing reads, including human and bacterial contamination, as well as the presence of reads derived from the viral sub-genomic RNAs.


Subjects
COVID-19/diagnosis , Nanopore Sequencing/methods , Pandemics , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , SARS-CoV-2/isolation & purification
13.
Virus Genes ; 57(6): 556-560, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34448987

ABSTRACT

SARS-CoV-2 mutants carrying the ∆H69/∆V70 deletion in the amino-terminal domain of the Spike protein emerged independently in at least six lineages of the virus (namely, B.1.1.7, B.1.1.298, B.1.160, B.1.177, B.1.258, B.1.375). We analyzed SARS-CoV-2 samples collected from various regions of Slovakia between November and December 2020 that were presumed to contain B.1.1.7 variant due to drop-out of the Spike gene target in an RT-qPCR test caused by this deletion. Sequencing of these samples revealed that although in some cases the samples were indeed confirmed as B.1.1.7, a substantial fraction of samples contained another ∆H69/∆V70 carrying mutant belonging to the lineage B.1.258, which has been circulating in Central Europe since August 2020, long before the import of B.1.1.7. Phylogenetic analysis shows that the early sublineage of B.1.258 acquired the N439K substitution in the receptor-binding domain (RBD) of the Spike protein and, later on, also the deletion ∆H69/∆V70 in the Spike N-terminal domain (NTD). This variant was particularly common in several European countries including the Czech Republic and Slovakia but has been quickly replaced by B.1.1.7 early in 2021.


Subjects
COVID-19/epidemiology , COVID-19/virology , Phylogeny , SARS-CoV-2/genetics , SARS-CoV-2/isolation & purification , Sequence Deletion , Coronavirus Spike Glycoprotein/genetics , Europe/epidemiology , Humans , SARS-CoV-2/classification , Time Factors
14.
Bioinformatics ; 37(24): 4661-4667, 2021 12 11.
Article in English | MEDLINE | ID: mdl-34314502

ABSTRACT

MOTIVATION: MinION is a portable nanopore sequencing device that can be easily operated in the field with features including monitoring of run progress and selective sequencing. To fully exploit these features, real-time base calling is required. To date, this has only been achieved at the cost of high computing requirements that pose limitations in terms of hardware availability in common laptops and energy consumption. RESULTS: We developed a new base caller DeepNano-coral for nanopore sequencing, which is optimized to run on the Coral Edge Tensor Processing Unit, a small USB-attached hardware accelerator. To achieve this goal, we have designed new versions of two key components used in convolutional neural networks for speech recognition and base calling. In our components, we propose a new way of factorization of a full convolution into smaller operations, which decreases memory access operations, memory access being a bottleneck on this device. DeepNano-coral achieves real-time base calling during sequencing with accuracy slightly better than the fast mode of the Guppy base caller and is extremely energy efficient, using only 10 W of power. AVAILABILITY AND IMPLEMENTATION: https://github.com/fmfi-compbio/coral-basecaller. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Nanopores , Software , DNA Sequence Analysis , High-Throughput Nucleotide Sequencing , Computer Neural Networks
15.
Bioinformatics ; 36(14): 4191-4192, 2020 08 15.
Article in English | MEDLINE | ID: mdl-32374816

ABSTRACT

MOTIVATION: Oxford Nanopore MinION is a portable DNA sequencer that is marketed as a device that can be deployed anywhere. Current base callers, however, require a powerful GPU to analyze data produced by MinION in real time, which hampers field applications. RESULTS: We have developed a fast base caller DeepNano-blitz that can analyze the stream from up to two MinION runs in real time using a common laptop CPU (i7-7700HQ), with no GPU requirements. The base caller settings allow trading accuracy for speed and the results can be used for real time run monitoring (e.g. sample composition, barcode balance, species identification) or prefiltering of results for more detailed analysis (e.g. filtering out human DNA from human-pathogen runs). AVAILABILITY AND IMPLEMENTATION: DeepNano-blitz has been developed and tested on Linux and Intel processors and is available under MIT license at https://github.com/fmfi-compbio/deepnano-blitz. CONTACT: vladimir.boza@fmph.uniba.sk. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Nanopores , DNA , High-Throughput Nucleotide Sequencing , Humans , DNA Sequence Analysis , Software
16.
Gigascience ; 9(1), 2020 01 01.
Article in English | MEDLINE | ID: mdl-31942620

ABSTRACT

BACKGROUND: The giant squid (Architeuthis dux; Steenstrup, 1857) is an enigmatic giant mollusc with a circumglobal distribution in the deep ocean, except in the high Arctic and Antarctic waters. The elusiveness of the species makes it difficult to study. Thus, having a genome assembled for this deep-sea-dwelling species will allow several pending evolutionary questions to be unlocked. FINDINGS: We present a draft genome assembly that includes 200 Gb of Illumina reads, 4 Gb of Moleculo synthetic long reads, and 108 Gb of Chicago libraries, with a final size matching the estimated genome size of 2.7 Gb, and a scaffold N50 of 4.8 Mb. We also present an alternative assembly including 27 Gb raw reads generated using the Pacific Biosciences platform. In addition, we sequenced the proteome of the same individual and RNA from 3 different tissue types from 3 other species of squid (Onychoteuthis banksii, Dosidicus gigas, and Sthenoteuthis oualaniensis) to assist genome annotation. We annotated 33,406 protein-coding genes supported by evidence, and the genome completeness estimated by BUSCO reached 92%. Repetitive regions cover 49.17% of the genome. CONCLUSIONS: This annotated draft genome of A. dux provides a critical resource to investigate the unique traits of this species, including its gigantism and key adaptations to deep-sea environments.
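
The scaffold N50 reported above is the length of the scaffold at which the running total of scaffold lengths, taken from longest to shortest, first reaches half of the assembly size. A minimal Python sketch of this standard definition follows; the function name is hypothetical and the code is not taken from the paper's pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold or contig lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first scaffold where the cumulative length
        # covers at least half of the assembly.
        if 2 * running >= total:
            return length
```

Comparing `2 * running` with `total` keeps the arithmetic in integers and avoids rounding issues for odd totals.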


Subjects
Decapodiformes/genetics , Genome , Genomics , Animals , Biological Evolution , Liquid Chromatography , Computational Biology/methods , DNA Transposable Elements , Gene Expression Profiling , Genomics/methods , Molecular Sequence Annotation , Multigene Family , Untranslated RNA , Tandem Mass Spectrometry , Transcriptome , Whole Genome Sequencing
17.
Microbiol Resour Announc ; 8(50), 2019 Dec 12.
Article in English | MEDLINE | ID: mdl-31831616

ABSTRACT

Chromosome-scale genome assembly of the yeast Saprochaete ingens CBS 517.90 was determined by a combination of technologies producing short (HiSeq X; Illumina) and long (MinION; Oxford Nanopore Technologies) reads. The 21.2-Mbp genome sequence has a GC content of 36.9% and codes for 6,475 predicted proteins.

18.
Evol Bioinform Online ; 15: 1176934319849071, 2019.
Article in English | MEDLINE | ID: mdl-31210725

ABSTRACT

Computing similarity between 2 nucleotide sequences is one of the fundamental problems in bioinformatics. Current methods are based mainly on 2 major approaches: (1) sequence alignment, which is computationally expensive, and (2) faster, but less accurate, alignment-free methods based on various statistical summaries, for example, short word counts. We propose a new distance measure based on mathematical transforms from the domain of signal processing. To tolerate large-scale rearrangements in the sequences, the transform is computed across sliding windows. We compare our method on several data sets with current state-of-the-art alignment-free methods. Our method compares favorably in terms of accuracy and outperforms other methods in running time and memory requirements. In addition, it is massively scalable up to dozens of processing units without the loss of performance due to communication overhead. Source files and sample data are available at https://bitbucket.org/fiitstubioinfo/swspm/src.

19.
Microbiol Resour Announc ; 8(15), 2019 Apr 11.
Article in English | MEDLINE | ID: mdl-30975801

ABSTRACT

Saprochaete fungicola is an arthroconidial yeast classified in the Magnusiomyces/Saprochaete clade of the subphylum Saccharomycotina. Here, we report the genome sequence of holotype strain CBS 625.85, assembled to five putative chromosomes. The genome sequence is 20.2 Mbp long and codes for 6,138 predicted proteins.

20.
Article in English | MEDLINE | ID: mdl-30834381

ABSTRACT

Saprochaete suaveolens is an ascomycetous yeast that produces a range of fruity flavors and fragrances. Here, we report the high-contiguity genome sequence of the ex-holotype strain, NRRL Y-17571 (CBS 152.25). The nuclear genome sequence contains 24.4 Mbp and codes for 8,119 predicted proteins.
