Results 1 - 20 of 103
1.
Bioinformatics ; 40(4)2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38498847

ABSTRACT

MOTIVATION: Proteoform identification is an important problem in proteomics. The main task is to find a modified protein that best fits the input spectrum. To overcome the combinatorial explosion of possible proteoforms, the proteoform mass graph and spectrum mass graph are used to represent the protein database and the spectrum, respectively. The problem becomes finding an optimal alignment between the proteoform mass graph and the spectrum mass graph. Peak error correction is an important issue for computing an optimal alignment between the two input mass graphs. RESULTS: We propose a faster algorithm for the error correction alignment of spectrum mass graph and proteoform mass graph problem and produce a program package TopMGFast. The newly designed algorithms require less space and running time so that we are able to compute global optimal alignments for the two input mass graphs in a reasonable time. For the local alignment version, experiments show that the running time of the new algorithm is reduced by 2.5 times. For the global alignment version, experiments show that the maximum mass errors between any pair of matched nodes in the alignments obtained by our method are within a small range as designed, while the alignments produced by the state-of-the-art method, TopMG, have very large maximum mass errors for many cases. The obtained alignment sizes are roughly the same for both TopMG and TopMGFast. Of course, TopMGFast needs more running time than TopMG. Therefore, our new algorithm can obtain more reliable global alignments within a reasonable time. This is the first time that global optimal error correction alignments can be obtained using real datasets. AVAILABILITY AND IMPLEMENTATION: The source code of the algorithm is available at https://github.com/Zeirdo/TopMGFast.


Subjects
Protein Processing, Post-Translational , Proteome , Proteome/metabolism , Algorithms , Tandem Mass Spectrometry , Software
2.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35136947

ABSTRACT

In this paper, we study the problem of finding complex proteoforms from protein databases based on top-down tandem mass spectrum data. The main difficulty in solving the problem is handling the combinatorial explosion of various alterations on a protein. To overcome this explosion, the problem has been formulated as the alignment problem of a proteoform mass graph (PMG) and a spectrum mass graph (SMG). The other important issue is handling the mass errors of peaks in the input spectrum. In previous methods, an error tolerance value is used to handle the mass differences between the matched consecutive nodes/peaks in PMG and SMG. However, handling mass errors this way cannot guarantee that the mass difference between any pair of nodes in the alignment is approximately the same for both PMG and SMG. It may lead to large error accumulation if positive (or negative) errors occur consecutively for a large number of consecutive matched node pairs. The problem is severe enough that some existing software packages include a step to further refine the alignments. In this paper, we propose a new model to handle the mass errors of peaks based on the formulation of the PMG and SMG. Note that the masses of sub-paths on the PMG are theoretical and are supposed to be accurate. Our method allows each peak in the input spectrum to have a predefined error range. In the alignment of PMG and SMG, we need to give a correction of the mass for each matched peak within the predefined error range. After the correction, we impose that the mass between any two (not necessarily consecutive) matched nodes in the PMG is identical to that of the corresponding two matched peaks in the SMG. Intuitively, this kind of alignment is more accurate. We design an algorithm to find a maximum number of matched node and peak pairs in the two (PMG and SMG) mass graphs under the new constraint.
The obtained alignment can show matched node and peak pairs as well as the corrected positions of peaks. The algorithm works well for moderate-size input instances but takes a very long time and a huge amount of memory for large instances. Therefore, we propose an algorithm to do diagonal alignment. The diagonal alignment algorithm can solve large instances in reasonable time. Experiments show that our new algorithms can report alignments with a much larger number of matched node pairs. The software package and test data sets are available at https://github.com/Zeirdo/TopMGRefine.
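
The contrast between the two error models in this abstract can be illustrated with a small sketch. This is not the TopMGRefine algorithm: the function names, the sample masses, and the tolerance values are all hypothetical, and `correctable` encodes only one reading of the stated constraint (if every pairwise mass difference must agree after correction, all corrected peaks must share one common offset from their matched nodes).

```python
from itertools import combinations

def consecutive_ok(pmg, smg, tol):
    """Older model: compare mass gaps of consecutive matched pairs only."""
    return all(
        abs((pmg[i + 1] - pmg[i]) - (smg[i + 1] - smg[i])) <= tol
        for i in range(len(pmg) - 1)
    )

def max_pairwise_error(pmg, smg):
    """Largest gap discrepancy over ALL pairs of matched nodes."""
    return max(
        abs((pmg[j] - pmg[i]) - (smg[j] - smg[i]))
        for i, j in combinations(range(len(pmg)), 2)
    )

def correctable(pmg, smg, eps):
    """New model (as read here): can each peak be shifted by at most eps so
    that every pairwise gap agrees exactly? That needs one common offset
    smg[i] + c[i] - pmg[i], so the raw offsets may span at most 2 * eps."""
    offsets = [s - p for p, s in zip(pmg, smg)]
    return max(offsets) - min(offsets) <= 2 * eps

# Hypothetical matched masses: every observed gap is 0.02 too large.
pmg = [0.0, 100.0, 200.0, 300.0, 400.0, 500.0]
smg = [0.0, 100.02, 200.04, 300.06, 400.08, 500.10]

print(consecutive_ok(pmg, smg, tol=0.03))      # each single step looks fine
print(round(max_pairwise_error(pmg, smg), 2))  # but the drift accumulates
print(correctable(pmg, smg, eps=0.03))         # rejected under the pairwise model
```

With these made-up numbers every consecutive step passes a 0.03 tolerance, yet the first and last matched pairs disagree by about 0.10, which is the kind of accumulation the abstract describes.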


Subjects
Algorithms , Tandem Mass Spectrometry , Databases, Protein , Software , Tandem Mass Spectrometry/methods
3.
Bioinformatics ; 38(8): 2333-2340, 2022 04 12.
Article in English | MEDLINE | ID: mdl-35171986

ABSTRACT

MOTIVATION: Drawing peaks in a data window of an MS dataset happens all the time in MS data visualization applications. This requires retrieving from an MS dataset some selected peaks in a data window whose image in a display window reflects the visual features of all peaks in the data window. If an algorithm for this purpose must output high-quality solutions in real time, then it depends most fundamentally on the storage format of the MS dataset. RESULTS: We present mzMD, a new storage format for MS datasets, and an algorithm that queries a storage system in this format for a summary (a set of selected representative peaks) of a given data window. We propose a criterion, Q-score, to examine the quality of data window summaries. Experimental statistics on real MS datasets verified the high speed of mzMD in retrieving high-quality data window summaries. mzMD reported summaries of data windows whose Q-scores exceed those of the summaries mzTree reported. The query speed of mzMD is the same as that of mzTree, whereas its query speed stability is better than that of mzTree. AVAILABILITY AND IMPLEMENTATION: The source code is freely available at https://github.com/yrm9837/mzMD-java. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Software , Information Storage and Retrieval , Data Visualization , Data Accuracy
4.
J Med Genet ; 59(3): 230-236, 2022 03.
Article in English | MEDLINE | ID: mdl-33397747

ABSTRACT

High-quality interpretation of BRCA1/2 variants plays a critical role in the clinical practice of precision medicine. However, a comprehensive system to evaluate the quality and accuracy of variant interpretation has yet to be established. This study investigates the performance of an interpretation system in evaluating the capacities of BRCA1/2 interpretation among distinct laboratories in China. The evaluation system is based on a reference database that contains 750 different variants in BRCA1/2. Evaluation was performed among 41 laboratories in China. We classified their performance into five levels. Only level A was considered qualified. This level allows for a 0.3% error rate for clinical decision-related misinterpretation; 26 of 41 laboratories (63%) met the qualified standard, while 7 laboratories were at levels D and E, which indicated egregious mistakes and systemic problems in variant interpretation. Due to strict quality demands, the interpretation of several variants was amended, which largely influenced the quality rate. The number of qualified laboratories would decrease from 26 to 17 if those incorrect recommended interpretations were not corrected. This evaluation system provides a potential approach for standardisation of variant interpretation and lowers the discordance of variant interpretation between different laboratories. A well-designed interpretation ability evaluation is essential to evaluate the interpretation level of laboratories before they provide service in real-world clinical settings.


Subjects
Genetic Testing , Laboratories , BRCA1 Protein/genetics , China , Genetic Variation , Humans
5.
J Asian Nat Prod Res ; 25(8): 731-740, 2023.
Article in English | MEDLINE | ID: mdl-36448521

ABSTRACT

A total of 16 fungal strains were isolated from fresh leaves and flowers of Magnolia grandiflora, and their EtOAc extracts were assayed for antitumor activities. Among these, the fungus Dothideomycetes sp. BMC-101, which showed broad-spectrum inhibition, was selected for further study. Four alkaloids (1-4), including two new compounds, (2-(hydroxyimino)-3-phenylpropanoyl)-L-phenylalanine (1) and 8-Acetyl-bisdethiobis(methylsulfanyl)apoaranotin (4), were isolated from Dothideomycetes sp. BMC-101. The structure of 1 was characterized with an oxime moiety formed by the condensation of two phenylalanines. To our knowledge, this is the first report on a fungal phenylalanine derivative with an oxime moiety.

6.
J Nat Prod ; 85(12): 2789-2795, 2022 12 23.
Article in English | MEDLINE | ID: mdl-36480660

ABSTRACT

Four new bisanthraquinones, dothideomins A-D (1-4), were identified from Dothideomycetes sp. BMC-101, an endophytic fungus isolated from Magnolia grandiflora L. leaves. Their chemical structures were established by NMR analysis, single-crystal X-ray crystallography, and ECD analysis. Dothideomins A-D (1-4) were characterized by an unusual 6/6/6/5/6/3/6/6 octocyclic scaffold (1 and 2) and a 6/6/6/5/6/6/6 heptacyclic scaffold (3 and 4), respectively. All compounds, especially 1 and 3, exhibited potent antibacterial activity with MIC values ranging from 0.4 to 0.8 µg/mL.


Subjects
Anti-Bacterial Agents , Ascomycota , Anti-Bacterial Agents/chemistry , Ascomycota/chemistry , Crystallography, X-Ray , Magnetic Resonance Spectroscopy , Molecular Structure
7.
Environ Microbiol ; 23(7): 3599-3613, 2021 07.
Article in English | MEDLINE | ID: mdl-32939951

ABSTRACT

Thermococcales has a strong adaptability to extreme environments, which is of profound interest in explaining how complex life forms emerge on earth. However, their gene composition, thermal stability and evolution in hyperthermal environments remain poorly understood. Here, we characterized the pan-genome architecture of 30 Thermococcales species to gain insight into their genetic properties, evolutionary patterns and specific metabolisms adapted to niches. We revealed an open pan-genome of Thermococcales comprising 6070 gene families, a number that tends to increase as additional genomes become available. The genome contents of Thermococcales were flexible, with a series of genes that had experienced gene duplication, progressive divergence, or gene gain and loss events exhibiting distinct functional features. These archaea had concise types of heat shock proteins, such as HSP20, HSP60 and prefoldin, which were constrained by strong purifying selection that governed their conservative evolution. Furthermore, purifying selection forced genes involved in enzymes, motility, secretion systems, defence systems and chaperones to differ in functional constraints, and their disparity in the rate of evolution may be related to adaptation to specific niches. These results deepened our understanding of genetic diversity and adaptation patterns of Thermococcales, and provided valuable research models for studying the metabolic traits of early life forms.


Subjects
Thermococcales , Adaptation, Physiological/genetics , Evolution, Molecular , Gene Duplication , Genome , Humans , Phylogeny , Thermococcales/genetics
8.
Breast Cancer Res Treat ; 189(2): 533-539, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34196900

ABSTRACT

PURPOSE: Mutations in hereditary breast cancer genes play an important role in the risk for cancer. METHODS: Cancer susceptibility genes were sequenced in 664 unselected breast cancer cases from Guatemala. Variants were annotated with ClinVar and VarSome. RESULTS: A total of 73 out of 664 subjects (11%) had a pathogenic variant in a high or moderate penetrance gene. The most frequently mutated genes were BRCA1 (37/664, 5.6%) followed by BRCA2 (15/664, 2.3%), PALB2 (5/664, 0.8%), and TP53 (5/664, 0.8%). Pathogenic variants were also detected in the moderate penetrance genes ATM, BARD1, CHEK2, and MSH6. The high ratio of BRCA1/BRCA2 mutations is due to two potential founder mutations: the BRCA1 c.212+1G>A splice mutation (15 cases) and BRCA1 c.799delT (9 cases). Cases with pathogenic mutations had a significantly earlier age at diagnosis (45 vs 51 years, P < 0.001), were more likely to have been diagnosed before menopause, and a higher percentage had a relative with any cancer (51% vs 37%, P = 0.038) or breast cancer (33% vs 15%, P < 0.001). CONCLUSIONS: Hereditary breast cancer mutations were observed among Guatemalan women, and these women are more likely to have an early age at diagnosis and a family history of cancer. These data suggest the use of genetic testing in breast cancer patients and those at high risk as part of a strategy to reduce breast cancer mortality in Guatemala.


Subjects
Breast Neoplasms , Genetic Predisposition to Disease , Germ-Line Mutation , BRCA1 Protein/genetics , BRCA2 Protein/genetics , Breast Neoplasms/diagnosis , Breast Neoplasms/epidemiology , Breast Neoplasms/genetics , Female , Genes, BRCA2 , Germ Cells , Guatemala , Humans
9.
J Fluoresc ; 30(4): 883-890, 2020 Jul.
Article in English | MEDLINE | ID: mdl-32494936

ABSTRACT

Based on boron-dipyrromethene (BODIPY), and taking 2-hydroxy-N-(2-hydroxyphenyl)benzamide as the recognition site, a new fluorescent probe, HHPBA-BODIPY, aimed at sensitively detecting Cu ions was designed, synthesized and characterized. The emission spectra of HHPBA-BODIPY exhibited an intense green fluorescence around 510 nm, with a maximum absorption near 500 nm. When Cu2+ ions are present, the fluorescence at 510 nm is quenched, with good linearity between the copper ion concentration and the fluorescence intensity, and the detection limit is 0.35 µM. HHPBA-BODIPY is also selective toward Cu2+: other metal ions do not interfere, except Fe3+ and Cr3+ ions. In addition, HHPBA-BODIPY also proved efficient in detecting Cu2+ in water samples, which offers the possibility of detecting trace amounts of Cu2+ for environmental monitoring. Keywords: Copper ions; BODIPY; fluorescent probe.

10.
Bioinformatics ; 34(12): 2012-2018, 2018 06 15.
Article in English | MEDLINE | ID: mdl-29474523

ABSTRACT

Motivation: Haplotype information is essential to the complete description and interpretation of genomes, genetic diversity and genetic ancestry. The new technologies can provide Single Molecular Sequencing (SMS) data that cover about 90% of positions over chromosomes. However, the SMS data have a higher error rate compared with the 1% error rate for short reads. Thus, SNP calling and haplotype assembly using SMS reads become very difficult. Most existing technologies do not work properly for the SMS data. Results: In this paper, we develop a progressive approach for SNP calling and haplotype assembly that works very well for the SMS data. Our method can handle more than 200 million non-N bases on Chromosome 1 with millions of reads, more than 100 blocks, each of which contains more than 2 million bases and more than 3K SNP sites on average. Experimental results show that the false discovery rate and false negative rate for our method are 15.7 and 11.0% on NA12878, and 16.5 and 11.0% on NA24385. Moreover, the overall switch errors for our method are 7.26 and 5.21 with average 3378 and 5736 SNP sites per block on NA12878 and NA24385, respectively. Here, we demonstrate that SMS reads alone can generate a high quality solution for both SNP calling and haplotype assembly. Availability and implementation: Source codes and results are available at https://github.com/guofeieileen/SMRT/wiki/Software.
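
The two rates reported in this abstract can be sketched with the standard set-based definitions, FDR = FP / (TP + FP) and FNR = FN / (TP + FN). The abstract does not state the exact definitions the authors used, so this is an assumption, and the function name and the SNP positions below are hypothetical.

```python
def snp_calling_rates(called, truth):
    """Return (false discovery rate, false negative rate) for called SNP
    positions against a truth set, using the standard definitions."""
    called, truth = set(called), set(truth)
    tp = len(called & truth)   # called and true
    fp = len(called - truth)   # called but not true
    fn = len(truth - called)   # true but missed
    fdr = fp / (tp + fp) if called else 0.0
    fnr = fn / (tp + fn) if truth else 0.0
    return fdr, fnr

# Hypothetical positions: 3 correct calls, 1 spurious call, 2 missed sites.
fdr, fnr = snp_calling_rates(called={1, 2, 3, 4}, truth={2, 3, 4, 5, 6})
print(fdr, fnr)
```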


Subjects
Haplotypes , High-Throughput Nucleotide Sequencing/methods , Polymorphism, Single Nucleotide , Sequence Analysis, DNA/methods , Software , Algorithms , Humans , Molecular Sequence Data
11.
Ophthalmology ; 126(11): 1549-1556, 2019 11.
Article in English | MEDLINE | ID: mdl-31054281

ABSTRACT

PURPOSE: To characterize the genetic landscape of patients with suspected retinitis pigmentosa (RP) in the Chinese population. DESIGN: Cohort study. PARTICIPANTS: A total of 1243 patients of Chinese origin with clinically suspected RP and their available family members (n = 2701) were recruited. METHODS: All patients and available family members were screened using multigene panel testing (including 586 eye disease-associated genes), followed by clinical variant interpretation. MAIN OUTCOME MEASURES: Diagnostic yield, the 17 most commonly implicated genes, age at onset, de novo mutations, and clinical usefulness of genetic testing. RESULTS: Overall, 72.08% of patients received a molecular diagnosis, and the 17 top genes covered 75.63% of diagnostic cases. Diagnostic yield was higher among patients in the early-onset subgroup (≤5 years old, 79.58%) than in the childhood or adolescence-onset subgroup (6-16 years old, 73.74%) and late-onset subgroup (≥17 years old, 65.99%). Moreover, different genes were associated with different onset ages, and subgroups with different onset ages showed a diverse mutation spectrum. Only 11 de novo mutations (3.18%) were identified. Furthermore, 16.84% of the patients who received a molecular diagnosis had refinement of the initial clinical diagnoses, and the remaining 83.16% received definite genetic subtypes of RP. CONCLUSIONS: This large cohort study provides population-based data on the genome landscape of patients with suspected RP in China. The diagnostic yield was significantly higher than that in previous studies, and the mutation spectrum is completely different from that of other populations. Genetic testing improves the chance to establish a precise diagnosis, identifies features not previously determined, and allows a more accurate refinement of risk to family members.
Our results not only expand the existing genotypic spectrum but also serve as an efficient reference for the design of panel-based genetic diagnostic testing and genetic counseling for patients with suspected RP in China.


Subjects
Asian People/genetics , Eye Proteins/genetics , Retinitis Pigmentosa/genetics , Adolescent , Adult , Age of Onset , Aged , Aged, 80 and over , Child , Child, Preschool , China/epidemiology , Cohort Studies , DNA Mutational Analysis , Female , Genetic Association Studies , Genetic Testing , High-Throughput Nucleotide Sequencing , Humans , Infant , Male , Middle Aged , Mutation , Retinitis Pigmentosa/diagnosis
12.
Ren Fail ; 41(1): 842-849, 2019 Nov.
Article in English | MEDLINE | ID: mdl-31488014

ABSTRACT

Purpose: Autosomal dominant polycystic kidney disease (ADPKD) is characterized by progressive development of kidney cysts and enlargement and dysfunction of the kidneys. The Consortium of Radiologic Imaging Studies of the Polycystic Kidney Disease (CRISP) cohort revealed that 89.1% had either a PKD1 or PKD2 mutation. Of the CRISP patients with a genetic cause detected, mutations in PKD1 accounted for 85%, while mutations in the PKD2 accounted for the remaining 15%. Here, we report exome sequencing of 16 Saudi patients diagnosed with ADPKD and 16 ethnically matched controls. Methods: Exome sequencing was performed using combinatorial probe-anchor synthesis and improved DNA Nanoballs technology on BGISEQ-500 sequencers (BGI, China) using the BGI Exome V4 (59 Mb) Kit. Identified variants were validated with Sanger sequencing. Results: With the exception of GC-rich exon 1, we obtained excellent coverage of PKD1 (mean read depth = 88) including both duplicated and non-duplicated regions. Of nine patients with typical ADPKD presentations (bilateral symmetrical kidney involvement, positive family history, concordant imaging, and kidney function), four had protein truncating PKD1 mutations, one had a PKD1 missense mutation, and one had a PKD2 mutation. These variants have not been previously observed in the Saudi population. In seven clinically diagnosed ADPKD cases but with atypical features, no PKD1 or PKD2 mutations were identified, but rare predicted pathogenic heterozygous variants were found in cystogenic candidate genes including PKHD1, PKD1L3, EGF, CFTR, and TSC2. Conclusions: Mutations in PKD1 and PKD2 are the most common cause of ADPKD in Saudi patients with typical ADPKD. Abbreviations: ADPKD: Autosomal dominant polycystic kidney disease; CFTR: Cystic fibrosis transmembrane conductance regulator; EGF: Epidermal growth factor; MCIC: Mayo Clinic Imaging Classification; PKD: Polycystic kidney disease; TSC2: Tuberous sclerosis complex 2.


Subjects
Polycystic Kidney, Autosomal Dominant/genetics , Adult , Aged , Arabs/genetics , Calcium Channels/genetics , Case-Control Studies , Cystic Fibrosis Transmembrane Conductance Regulator/genetics , DNA Mutational Analysis , Epidermal Growth Factor/genetics , Exons/genetics , Female , Humans , Male , Middle Aged , Mutation, Missense , Polycystic Kidney, Autosomal Dominant/diagnostic imaging , Receptors, Cell Surface/genetics , Saudi Arabia , TRPP Cation Channels/genetics , Tomography, X-Ray Computed , Tuberous Sclerosis Complex 2 Protein/genetics , Exome Sequencing
13.
Sensors (Basel) ; 19(18)2019 Sep 05.
Article in English | MEDLINE | ID: mdl-31492027

ABSTRACT

Physiological information such as respiratory rate and heart rate in the sleep state can be used to evaluate the health condition of the sleeper. Traditional sleep monitoring systems need body contact and are intrusive, which limits their applicability. Thus, a comfortable sleep biosignals detection system with both high accuracy and low cost is important for health care. In this paper, we design a sleep biosignals detection system based on low-cost piezoelectric ceramic sensors. 18 piezoelectric ceramic sensors are deployed under the mattress to capture the pressure data. The appropriate sensor that captures respiration and heartbeat sensitively is selected by the proposed channel-selection algorithm. Then, we propose a dynamic smoothing algorithm to extract respiratory rate and heart rate using the selected data. The dynamic smoothing can separate heartbeat signals from respiratory signals with low complexity by dynamically choosing the smooth window, and it is suitable for real-time implementation in low-cost embedded systems. For comparison, wavelet analysis and ensemble empirical mode decomposition (EEMD) are performed in a personal computer (PC). Experimental results show that data collected by piezoelectric ceramic sensors can be used for respiratory-rate and heart-rate detection with high accuracy. In addition, the dynamic smoothing can achieve high accuracy close to wavelet analysis and EEMD, while it has much lower complexity.

14.
BMC Bioinformatics ; 19(Suppl 9): 291, 2018 Aug 13.
Article in English | MEDLINE | ID: mdl-30367596

ABSTRACT

BACKGROUND: Genome rearrangements describe changes in the genetic linkage relationship of large chromosomal regions, involving reversals, transpositions, block interchanges, deletions, insertions, fissions, fusions and translocations. Many algorithms for calculating rearrangement scenarios between two genomes have been proposed. Very often, the calculated rearrangement scenario is not unique for the same pair of permutations. Hence, deciding which calculated rearrangement scenario is more biologically meaningful becomes an essential task. Up to now, several mechanisms for genome rearrangements have been studied. One important theory is that genome rearrangement may be mediated by repeats, especially for reversal events. Many reversal regions are found to be flanked by a pair of inverted repeats. As a result, whether there are repeats at the breakpoints of the calculated rearrangement events can shed light on whether the calculated rearrangement events are biologically meaningful. To our knowledge, there is no tool that can automatically identify rearrangement events and check whether repeats exist at the breakpoints of each calculated rearrangement event. RESULTS: In this paper, we describe a new tool named GRSR, which allows us to compare multiple unichromosomal genomes to identify "independent" (obvious) rearrangement events such as reversals, (inverted) block interchanges and (inverted) transpositions, and which automatically searches for repeats at the breakpoints of each rearrangement event. We apply our tool to the complete genomes of 28 Mycobacterium tuberculosis strains and 24 Shewanella strains, respectively. In both Mycobacterium tuberculosis and Shewanella strains, our tool finds many reversal regions flanked by a pair of inverted repeats.
In particular, the GRSR tool also finds an inverted transposition and an inverted block interchange in Shewanella, where the repeats at the ends of rearrangement regions remain unchanged after the rearrangement event. To our knowledge, this is the first time such a phenomenon for inverted transposition and inverted block interchange is reported in Shewanella. CONCLUSIONS: From the calculated results, there are many examples supporting the theory that the existence of repeats at the breakpoints of a rearrangement event can make the sequences at the breakpoints remain unchanged before and after the rearrangement events, suggesting that the conservation of ends could possibly be a popular phenomenon in many types of genome rearrangement events.
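
The observation that inverted repeats leave the ends of a reversed region unchanged can be checked on a toy sequence. This sketch is not the GRSR tool: the function names and the 23-base sequence are hypothetical, and it assumes the reversal spans both repeats and exact (mismatch-free) repeat copies.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]

def flanked_by_inverted_repeats(genome, start, end, k):
    """True if the k bases before genome[start:end] are the reverse
    complement of the k bases after it."""
    if start - k < 0 or end + k > len(genome):
        return False
    left = genome[start - k:start]
    right = genome[end:end + k]
    return left == reverse_complement(right)

def invert(genome, start, end):
    """Apply a reversal (inversion) to genome[start:end]."""
    return genome[:start] + reverse_complement(genome[start:end]) + genome[end:]

# Toy genome: TT | ACGTTG (repeat) | GATTACA (region) | CAACGT (inverted repeat) | AA
genome = "TT" + "ACGTTG" + "GATTACA" + "CAACGT" + "AA"
print(flanked_by_inverted_repeats(genome, 8, 15, k=6))

# Reverse the segment that includes both repeats; the ends read the same.
after = invert(genome, 2, 21)
print(after[2:8], after[15:21])
```

Because the right repeat is the reverse complement of the left one, reversing the whole segment maps each repeat onto the other, so only the interior of the region changes.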


Subjects
Algorithms , Chromosome Mapping/methods , Gene Rearrangement , Genome, Bacterial , Mycobacterium tuberculosis/genetics , Shewanella/genetics , Translocation, Genetic
15.
BMC Bioinformatics ; 19(Suppl 1): 52, 2018 02 19.
Article in English | MEDLINE | ID: mdl-29504891

ABSTRACT

BACKGROUND: The haplotype assembly problem for diploid is to find a pair of haplotypes from a given set of aligned Single Nucleotide Polymorphism (SNP) fragments (reads). It has many applications in association studies, drug design, and genetic research. Since this problem is computationally hard, both heuristic and exact algorithms have been designed for it. Although exact algorithms are much slower, they are still of great interest because they usually output significantly better solutions than heuristic algorithms in terms of popular measures such as the Minimum Error Correction (MEC) score, the number of switch errors, and the QAN50 score. Exact algorithms are also valuable because they can be used to witness how good a heuristic algorithm is. The best known exact algorithm is based on integer linear programming (ILP) and it is known that ILP can also be used to improve the output quality of every heuristic algorithm with a little decline in speed. Therefore, faster ILP models for the problem are highly demanded. RESULTS: As in previous studies, we consider not only the general case of the problem but also its all-heterozygous case where we assume that if a column of the input read matrix contains at least one 0 and one 1, then it corresponds to a heterozygous SNP site. For both cases, we design new ILP models for the haplotype assembly problem which aim at minimizing the MEC score. The new models are theoretically better because they contain significantly fewer constraints. More importantly, our experimental results show that for both simulated and real datasets, the new model for the all-heterozygous (respectively, general) case can usually be solved via CPLEX (an ILP solver) at least 5 times (respectively, twice) faster than the previous bests. Indeed, the running time can sometimes be 41 times better. CONCLUSIONS: This paper proposes a new ILP model for the haplotype assembly problem and its all-heterozygous case, respectively. 
Experiments with both real and simulated datasets show that the new models can be solved within much shorter time by CPLEX than the previous bests. We believe that the models can be used to improve heuristic algorithms as well.
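
The Minimum Error Correction (MEC) score that these ILP models minimize can be computed directly for a candidate haplotype pair. The sketch below is only an illustration: it uses exhaustive search over the all-heterozygous case instead of ILP or CPLEX, and the function names, the `-` gap symbol, and the read matrix are hypothetical.

```python
from itertools import product

def mismatches(read, hap):
    """Count positions where a read disagrees with a haplotype ('-' = no call)."""
    return sum(1 for r, h in zip(read, hap) if r != "-" and r != h)

def mec_score(reads, h1, h2):
    """MEC: each read is charged its distance to the closer haplotype."""
    return sum(min(mismatches(r, h1), mismatches(r, h2)) for r in reads)

def best_all_heterozygous(reads):
    """Brute-force optimum when h2 is the bitwise complement of h1.
    Exponential in the number of SNP sites, so only for tiny inputs."""
    n = len(reads[0])
    flip = str.maketrans("01", "10")
    best = None
    for bits in product("01", repeat=n):
        h1 = "".join(bits)
        score = mec_score(reads, h1, h1.translate(flip))
        if best is None or score < best[0]:
            best = (score, h1)
    return best

# Hypothetical read matrix over 4 SNP sites.
reads = ["0011", "00-1", "1100", "1-00", "0111"]
print(mec_score(reads, "0011", "1100"))
print(best_all_heterozygous(reads)[0])
```

Only the last read conflicts with the pair 0011/1100, at a single site, so one correction suffices for this toy matrix.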


Subjects
Algorithms , Haplotypes , Programming, Linear , Heterozygote , Humans , Models, Genetic , Polymorphism, Single Nucleotide
16.
J Fluoresc ; 28(4): 933-941, 2018 Jul.
Article in English | MEDLINE | ID: mdl-29938389

ABSTRACT

A new boron-dipyrromethene (BODIPY) fluorescent dye aimed at sensitively detecting hypochlorite anion (ClO-) has been designed, synthesized and characterized. The probe comprises a BODIPY fluorophore unit and an amidoxime group that reacts specifically with ClO-. The addition of hypochlorite results in a red-shift of the absorption and emission spectra of the probe accompanied by a decrease in intensity, and the spectral changes (A500 and 1/I512) of the probe show good linearity with the concentration of ClO-. The fluorescence probe can react with ClO- rapidly (within 60 s) in a wide pH range (4-10) with high sensitivity (detection limit of 6.81 µM) and selectivity. The reaction mechanism has been proposed and confirmed by MS analysis: the ClO- anion oxidizes the amidoxime moiety to a hydroxyl group, and the hydroxyl group is further oxidized to a formyl group, forming the corresponding aldehyde compound. In addition, the probe has also been successfully applied to detect ClO- in tap water and river water samples by spiking a known amount of standard ClO-.

17.
Biomed Eng Online ; 17(Suppl 1): 133, 2018 Nov 20.
Article in English | MEDLINE | ID: mdl-30458797

ABSTRACT

BACKGROUND: Pseudomonas aeruginosa is a common bacterium which is recognized for its association with hospital-acquired infections and its advanced antibiotic resistance mechanisms. Tuberculosis, one of the major causes of mortality, is initiated by the deposition of Mycobacterium tuberculosis. Accessory sequences shared by a subset of strains of a species play an important role in a species' evolution, antibiotic resistance and infectious potential. RESULTS: Here, with a multiple sequence aligner, we segmented 25 P. aeruginosa genomes and 28 M. tuberculosis genomes into core blocks (containing sequences shared by all the input genomes) and dispensable blocks (containing sequences shared by a subset of the input genomes), respectively. For each input genome, we then constructed a scaffold consisting of its core and dispensable blocks sorted by the blocks' locations on the chromosomes. Consecutive dispensable blocks on these scaffolds formed instable regions. After a comprehensive study of these instable regions, three characteristics of instable regions are summarized: instable regions were short, site specific and varied in different strains. Three DNA elements (directed repeats (DRs), transposons and integrons) were then studied to see whether these DNA elements are associated with the variation of instable regions. A pipeline was developed to search for DR pairs on the flanks of every instable sequence. 27 DR pairs in P. aeruginosa strains and 6 pairs in M. tuberculosis strains were found to exist in the instable regions. On average, 14% and 12% of instable regions in P. aeruginosa strains covered transposase genes and integrase genes, respectively. In M. tuberculosis strains, an average of 43% and 8% of instable regions contain transposase genes and integrase genes, respectively. CONCLUSIONS: Instable regions were short, site specific and varied in different strains for both P. aeruginosa and M. tuberculosis.
Our experimental results showed that DRs, transposons and integrons may be associated with variation of instable regions.
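
The definition of instable regions in this abstract (runs of consecutive dispensable blocks on a genome's scaffold) can be sketched as follows. This is an illustration only, not the authors' pipeline: the block identifiers, the `(block_id, kind)` tuple layout, and the function names are hypothetical.

```python
from itertools import groupby

def classify_block(genomes_with_block, n_genomes):
    """Core if every input genome contains the block, else dispensable."""
    return "core" if len(set(genomes_with_block)) == n_genomes else "dispensable"

def instable_regions(scaffold):
    """Group maximal runs of consecutive dispensable blocks.

    `scaffold` is a list of (block_id, kind) tuples already sorted by
    chromosomal location; each returned region is a list of block ids.
    """
    regions = []
    for is_dispensable, run in groupby(scaffold, key=lambda b: b[1] == "dispensable"):
        if is_dispensable:
            regions.append([block_id for block_id, _ in run])
    return regions

# Hypothetical scaffold of one strain.
scaffold = [
    ("c1", "core"),
    ("d1", "dispensable"),
    ("d2", "dispensable"),
    ("c2", "core"),
    ("d3", "dispensable"),
]
print(instable_regions(scaffold))
```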


Subjects
Computational Biology , Genome, Bacterial , Mycobacterium tuberculosis/genetics , Pseudomonas aeruginosa/genetics , Sequence Alignment , Sequence Analysis, DNA , DNA Transposable Elements , Drug Resistance, Bacterial , Gene Deletion , Humans , Integrases/metabolism , Mutagenesis, Insertional , Recombination, Genetic
18.
BMC Genomics ; 18(1): 268, 2017 Mar 29.
Article in English | MEDLINE | ID: mdl-28356070

ABSTRACT

BACKGROUND: Genome rearrangement describes gross changes of chromosomal regions, plays an important role in evolutionary biology and has profound impacts on phenotype in organisms ranging from microbes to humans. As more complete genomes have become available, many genomic comparisons have been conducted to find genome rearrangements and the mechanisms underlying the rearrangement events. In our opinion, genomic comparison of different individuals/strains within the same species (pan-genome) is more helpful for revealing the mechanisms of genome rearrangement, since genomes of the same species are much closer to each other. RESULTS: We study the mechanism of inversion events via core-genome scaffold comparison of different strains within the same species. We focus on two bacterial species, Pseudomonas aeruginosa and Escherichia coli, and investigate the inversion events among different strains of each species. We find that long (larger than 10,000 bp) inversion regions are flanked by a pair of Inverted Repeats (IRs). This mechanism can also explain why breakpoint reuse occurs in inversion events. We study the prevalence of the phenomenon and find that it is a major mechanism for inversions. We also observe that for other rearrangement events, such as transposition and inverted block interchange, the two ends of the swapped regions are likewise associated with repeats, so that the two ends of the swapped regions remain unchanged after the rearrangement operations. To our knowledge, this is the first time such a phenomenon has been reported for transposition events. CONCLUSIONS: In both Pseudomonas aeruginosa and Escherichia coli strains, IRs were found at the two ends of long sequence inversions. The two ends of the inversion remained unchanged before and after the inversion event. The existence of IRs can explain the breakpoint reuse phenomenon. We also observed that other rearrangement operations, such as transposition, inverted transposition and inverted block interchange, had repeats (not necessarily inverted) at the ends of each segment, where the ends remained unchanged before and after the rearrangement operations. This suggests that the conservation of ends could be a common phenomenon in many types of chromosome rearrangement events.
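The observation that the two ends of an inversion stay unchanged when the segment is bounded by inverted repeats follows from a simple property of reverse complements, shown in the minimal sketch below. The toy sequence, the repeat length `k` and all function names are hypothetical and chosen only for illustration; real IRs at inversion breakpoints are much longer and need not match exactly.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string over the alphabet ACGT."""
    return seq.translate(COMPLEMENT)[::-1]


def invert(genome, start, end):
    """Replace genome[start:end] with its reverse complement."""
    return genome[:start] + reverse_complement(genome[start:end]) + genome[end:]


def has_inverted_repeat_ends(genome, start, end, k):
    """True if the first k bases of the segment are the reverse
    complement of its last k bases (an exact inverted repeat pair)."""
    segment = genome[start:end]
    return len(segment) >= 2 * k and segment[:k] == reverse_complement(segment[-k:])


# Toy genome: GG + R + X + revcomp(R) + CC, with R = "ACGGT" and X = "TTGAC".
genome = "GG" + "ACGGT" + "TTGAC" + reverse_complement("ACGGT") + "CC"
start, end, k = 2, 17, 5

inverted = invert(genome, start, end)
print(has_inverted_repeat_ends(genome, start, end, k))         # True
print(inverted[start:start + k] == genome[start:start + k])    # True
print(inverted[end - k:end] == genome[end - k:end])            # True
```

Because the reverse complement of R + X + revcomp(R) is R + revcomp(X) + revcomp(R), only the interior X changes, which matches the abstract's statement that the ends remain unchanged before and after the inversion.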


Subjects
Escherichia coli/genetics , Bacterial Genome , Pseudomonas aeruginosa/genetics , Sequence Inversion/genetics , Comparative Genomic Hybridization , Gene Rearrangement
20.
Sensors (Basel) ; 17(11)2017 Nov 06.
Article in English | MEDLINE | ID: mdl-29113109

ABSTRACT

In future scenarios of heterogeneous and dense networks, randomly deployed small star networks (SSNs) become a key paradigm, whose system performance is limited by inter-SSN interference and which requires an efficient resource allocation scheme for interference coordination. Traditional resource allocation schemes do not specifically focus on this paradigm and are usually too time consuming in dense networks. In this article, a very efficient graph-based scheme is proposed, which applies the maximal independent set (MIS) concept from graph theory to help divide SSNs into almost interference-free groups. We first construct an interference graph for the system based on a derived distance threshold that indicates, for any pair of SSNs, whether there is intolerable inter-SSN interference. Then, SSNs are divided into MISs, and the same resource can be reused by all the SSNs in each MIS. Empirical parameters and equations are set in the scheme to guarantee high performance. Finally, extensive scenarios, both dense and non-dense, are randomly generated and simulated to demonstrate the performance of our scheme, indicating that it outperforms the classical max K-cut-based scheme in terms of system capacity, utility and especially time cost. Its achieved system capacity, utility and fairness can approach those of the near-optimal strategy obtained by a time-consuming simulated annealing search.
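The two steps described in this abstract (build an interference graph from a distance threshold, then divide the SSNs into independent sets that can share a resource) can be sketched as follows. This is only an illustration under stated assumptions: the threshold value, the 2-D coordinates and the greedy lowest-degree heuristic are hypothetical choices, not the derived threshold or the empirical parameters of the published scheme.

```python
import math


def interference_graph(positions, threshold):
    """Connect two SSNs when their distance is below `threshold`
    (assumed here to mark intolerable inter-SSN interference)."""
    adj = {i: set() for i in range(len(positions))}
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if math.dist(positions[i], positions[j]) < threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def partition_into_independent_sets(adj):
    """Repeatedly extract a maximal independent set from the nodes
    not yet grouped, using a greedy lowest-degree heuristic."""
    remaining = set(adj)
    groups = []
    while remaining:
        group = []
        candidates = set(remaining)
        while candidates:
            # Pick the candidate with the fewest conflicts left (ties by index).
            v = min(candidates, key=lambda u: (len(adj[u] & candidates), u))
            group.append(v)
            # v and its neighbours can no longer join this group.
            candidates -= adj[v] | {v}
        groups.append(sorted(group))
        remaining -= set(group)
    return groups


# Toy layout: two close pairs of SSNs that are far from each other.
positions = [(0, 0), (1, 0), (5, 0), (6, 0)]
adj = interference_graph(positions, threshold=2.0)
print(partition_into_independent_sets(adj))  # [[0, 2], [1, 3]]
```

Each group is maximal within the nodes still ungrouped when it is built, so SSNs in the same group never share an edge and could reuse the same resource under the threshold assumption.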
