Results 1 - 8 of 8
1.
Anal Chem ; 94(35): 11967-11972, 2022 09 06.
Article in English | MEDLINE | ID: mdl-35998076

ABSTRACT

One of the main challenges in cancer management relates to the discovery of reliable biomarkers, which could guide decision-making and predict treatment outcome. In particular, the rise and democratization of high-throughput molecular profiling technologies bolstered the discovery of "biomarker signatures" that could maximize the prediction performance. This approach has been widely applied to diverse omics data (i.e., genomics, transcriptomics, proteomics, metabolomics) but not to epitranscriptomics, which encompasses more than 100 biochemical modifications driving the post-transcriptional fate of RNA: stability, splicing, storage, and translation. We and others have studied chemical marks in isolation and associated them with cancer evolution, adaptation, as well as the response to conventional therapy. In this study, we have designed a unique pipeline combining multiplex analysis of the epitranscriptomic landscape by high-performance liquid chromatography coupled to tandem mass spectrometry with statistical multivariate analysis and machine learning approaches in order to identify biomarker signatures that could guide precision medicine and improve disease diagnosis. We applied this approach to analyze a cohort of adult diffuse glioma patients and demonstrate the existence of an "epitranscriptomics-based signature" that permits glioma grades to be discriminated and predicted with unmatched accuracy. This study demonstrates that epitranscriptomics (co)evolves with cancer progression and opens new prospects in the field of omics molecular profiling and personalized medicine.


Subject(s)
Glioma , RNA , Biomarkers , Glioma/diagnosis , Glioma/genetics , Humans , Metabolomics/methods , Multivariate Analysis , Proteomics/methods
2.
BMC Genomics ; 18(1): 730, 2017 Sep 15.
Article in English | MEDLINE | ID: mdl-28915793

ABSTRACT

BACKGROUND: Theobroma cacao L., native to the Amazonian basin of South America, is an economically important fruit tree crop for tropical countries as a source of chocolate. The first draft genome of the species, from a Criollo cultivar, was published in 2011. Although a useful resource, some improvements are possible, including identifying misassemblies, reducing the number of scaffolds and gaps, and anchoring un-anchored sequences to the 10 chromosomes. METHODS: We used an NGS-based approach to significantly improve the assembly of the Belizean Criollo B97-61/B2 genome. We combined four Illumina large-insert mate-pair libraries with 52x of Pacific Biosciences long reads to correct misassembled regions and reduce the number of scaffolds. We then used genotyping by sequencing (GBS) methods to increase the proportion of the assembly anchored to chromosomes. RESULTS: The scaffold number decreased from 4,792 in assembly V1 to 554 in V2, while the scaffold N50 size increased from 0.47 Mb in V1 to 6.5 Mb in V2. A total of 96.7% of the assembly was anchored to the 10 chromosomes compared to 66.8% in the previous version. Unknown sites (Ns) were reduced from 10.8% to 5.7%. In addition, we updated the functional annotations and performed a new RefSeq structural annotation based on RNAseq evidence. CONCLUSION: Theobroma cacao Criollo genome version 2 will be a valuable resource for the investigation of complex traits at the genomic level and for future comparative genomics and genetics studies in the cacao tree. New functional tools and annotations are available on the Cocoa Genome Hub ( http://cocoa-genome-hub.southgreen.fr ).


Subject(s)
Cacao/genetics , Genomics/methods , Chromosomes, Plant/genetics , Genome, Plant/genetics , High-Throughput Nucleotide Sequencing , Molecular Sequence Annotation
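
The scaffold N50 quoted in the abstract above is a standard assembly statistic: the length of the scaffold at which the running total of scaffold lengths, taken from longest to shortest, first reaches half of the assembly size. A minimal sketch follows; the function name `n50` and the toy lengths are illustrative assumptions, not data from the cacao assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    # Walk scaffolds from longest to shortest until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy lengths (hypothetical, arbitrary units), not taken from the paper.
fragmented = [80, 70, 50, 40, 30, 20]
# Same total size, but the two longest scaffolds have been joined.
merged = [150, 50, 40, 30, 20]

print(n50(fragmented), n50(merged))
```

Joining scaffolds without changing the total assembly size can only keep or raise the N50, which is why the statistic is commonly reported alongside the scaffold count.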
4.
Pac Symp Biocomput ; : 254-65, 1999.
Article in English | MEDLINE | ID: mdl-10380202

ABSTRACT

Suppose that a biologist wishes to study some local property P of genetic sequences. If the biologist can design (with a computer scientist) an algorithm C that efficiently compresses the parts of the sequence that satisfy P, then our algorithm TurboOptLift locates very quickly where property P occurs by chance on a sequence, and where it occurs as a result of a significant process. Under some conditions, the time complexity of TurboOptLift is O(n log n). We illustrate its use on the practical problem of locating approximate tandem repeats in DNA sequences.


Subject(s)
Base Sequence , DNA/chemistry , Repetitive Sequences, Nucleic Acid , Sequence Alignment , Algorithms , Computational Biology/methods , Computer Simulation , Computing Methodologies , DNA/genetics , Software
5.
Bioinformatics ; 15(3): 194-202, 1999 Mar.
Article in English | MEDLINE | ID: mdl-10222406

ABSTRACT

MOTIVATION: Evolution acts in several ways on DNA: either by mutating a base, or by inserting, deleting or copying a segment of the sequence (Ruddle, 1997; Russell, 1994; Li and Grauer, 1991). Classical alignment methods deal with point mutations (Waterman, 1995); genome-level mutations are studied using genome rearrangement distances (Bafna and Pevzner, 1993, 1995; Kececioglu and Sankoff, 1994; Kececioglu and Ravi, 1995). The latter distances generally operate, not on the sequences, but on an ordered list of genes. To our knowledge, no measure of distance attempts to compare sequences using a general set of segment-based operations. RESULTS: Here we define a new family of distances, called transformation distances, which quantify the dissimilarity between two sequences in terms of segment-based events. We focus on the case where segment-copy, -reverse-copy and -insertion are allowed in our set of operations. Those events are weighted by their description length, but other sets of weights are possible when biological information is available. The transformation distance from sequence S to sequence T is then the Minimum Description Length among all possible scripts that build T knowing S with segment-based operations. The underlying idea is related to Kolmogorov complexity theory. We present an algorithm which, given two sequences S and T, computes exactly and efficiently the transformation distance from S to T. Unlike alignment methods, the method we propose does not necessarily respect the order of the residues within the compared sequences and is therefore able to account for duplications and translocations that cannot be properly described by sequence alignment. A biological application to the Tnt1 tobacco retrotransposon is presented. AVAILABILITY: The algorithm and the graphical interface can be downloaded at http://www.lifl.fr/~varre/TD


Subject(s)
Algorithms , Evolution, Molecular , Transformation, Genetic , Base Sequence , Gene Rearrangement , Molecular Sequence Data , Mutation , Plants, Toxic , RNA/genetics , Retroelements/genetics , Sequence Alignment , Sequence Homology, Nucleic Acid , Software , Terminal Repeat Sequences , Nicotiana/genetics
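
The idea of a cheapest script that builds T from S, described in the abstract above, can be illustrated with a small dynamic program. This is a simplified sketch and not the authors' algorithm: it allows only forward segment copies and single-character insertions (no reverse-copy), and the weights `copy_cost` and `insert_cost` are hypothetical flat costs rather than true description lengths. The function name `script_cost` is also an assumption.

```python
def script_cost(source, target, copy_cost=3, insert_cost=1):
    """Cheapest script (under assumed flat weights) that builds target from source."""
    n = len(target)
    # best[i] is the cheapest cost found for building target[:i].
    best = [0] + [float("inf")] * n
    for end in range(1, n + 1):
        # Option 1: insert the single character target[end - 1].
        best[end] = best[end - 1] + insert_cost
        # Option 2: copy a segment target[start:end] that occurs in source.
        for start in range(end - 1, -1, -1):
            if target[start:end] not in source:
                break  # extending the segment to the left cannot make it occur
            best[end] = min(best[end], best[start] + copy_cost)
    return best[n]


# Two copies ("ACGT", "TACGT") plus one inserted "T".
print(script_cost("ACGTACGT", "ACGTTTACGT"))
# Direction matters: building the longer sequence needs two copies.
print(script_cost("ACGT", "ACGTACGT"), script_cost("ACGTACGT", "ACGT"))
```

Because a copied segment may come from anywhere in the source, the sketch is indifferent to residue order, which is the property the abstract contrasts with alignment.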
6.
Nucleic Acids Res ; 26(11): 2740-6, 1998 Jun 01.
Article in English | MEDLINE | ID: mdl-9592163

ABSTRACT

The leucine zipper is a dimerization domain occurring mostly in regulatory and thus in many oncogenic proteins. The leucine repeat in the sequence has traditionally been used for identification, but with poor reliability. The coiled coil structure of a leucine zipper is required for dimerization and can be predicted with reasonable accuracy by existing algorithms. We exploit this fact for identification of leucine zippers from sequence alone. We present a program, 2ZIP, which combines a standard coiled coil prediction algorithm with an approximate search for the characteristic leucine repeat. No further information from homologues is required for prediction. This approach improves significantly over existing methods, especially in that the coiled coil prediction turns out to be highly informative and avoids large numbers of false positives. Many problems in predicting zippers or assessing prediction results stem from wrong sequence annotations in the database.


Subject(s)
Algorithms , Leucine Zippers , Software , Amino Acid Sequence , Molecular Sequence Data
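
The approximate leucine-repeat search mentioned in the abstract above can be sketched as a scan for leucines spaced seven residues apart. This is an illustration only, not the 2ZIP implementation: the coiled coil prediction step is omitted, and the function name `leucine_repeat_hits`, the default of four heptads, and the one-mismatch tolerance are all assumptions.

```python
def leucine_repeat_hits(protein, heptads=4, max_mismatches=1):
    """Return (start, mismatches) for each window with L at every 7th position."""
    hits = []
    span = 7 * (heptads - 1) + 1  # residues from the first to the last repeat position
    for start in range(len(protein) - span + 1):
        positions = range(start, start + span, 7)
        mismatches = sum(1 for p in positions if protein[p] != "L")
        # Anchor each hit on a leucine and tolerate a few non-leucine positions.
        if protein[start] == "L" and mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


# Hypothetical toy sequence: L at positions 0, 7 and 21, V at position 14.
toy = "L" + "A" * 6 + "L" + "A" * 6 + "V" + "A" * 6 + "L"
print(leucine_repeat_hits(toy))
```

As the abstract notes, a repeat match on its own is a poor predictor, so a scan like this would only be useful as a filter alongside a structural prediction.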
7.
Comput Appl Biosci ; 13(2): 131-6, 1997 Apr.
Article in English | MEDLINE | ID: mdl-9146959

ABSTRACT

MOTIVATION: Compression algorithms can be used to analyse genetic sequences. A compression algorithm tests a given property on the sequence and uses it to encode the sequence: if the property is true, it reveals some structure of the sequence which can be described briefly; this yields a description of the sequence which is shorter than the sequence of nucleotides given in extenso. The more a sequence is compressed by the algorithm, the more significant the property is for that sequence. RESULTS: We present a compression algorithm that tests the presence of a particular type of dosDNA (defined ordered sequence-DNA): approximate tandem repeats of small motifs (i.e. of lengths < 4). This algorithm was tested on four yeast chromosomes. The presence of approximate tandem repeats seems to be a uniform structural property of yeast chromosomes.


Subject(s)
Algorithms , DNA/genetics , Repetitive Sequences, Nucleic Acid , Base Sequence , Chromosomes, Fungal/genetics , DNA, Fungal/genetics , Evaluation Studies as Topic , Molecular Sequence Data , Saccharomyces cerevisiae/genetics , Sequence Analysis, DNA/methods , Sequence Analysis, DNA/statistics & numerical data , Software
8.
Biochimie ; 78(5): 315-22, 1996.
Article in English | MEDLINE | ID: mdl-8905150

ABSTRACT

A novel approach to genetic sequence analysis is presented. This approach, based on compression algorithms, was introduced simultaneously by Grumbach and Tahi, Milosavljevic and Rivals. To reduce the description of an object, a compression algorithm replaces some regularities in the description by special codes. Thus a compression algorithm can be applied to a sequence in order to study the presence of those regularities throughout the sequence. This paper explains this ability, gives examples of compression algorithms already developed and mentions their applications. Finally, the theoretical foundations of the approach are presented in an overview of the algorithmic theory of information.


Subject(s)
Sequence Analysis/methods , Algorithms , Information Systems , Information Theory , Repetitive Sequences, Nucleic Acid
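
The principle in the abstract above, that a sequence compresses well when it contains the regularity a compressor looks for, can be demonstrated with a general-purpose compressor. The sketch below uses zlib from the Python standard library as a stand-in; it is not one of the dedicated DNA compression algorithms the paper surveys, and the function name `compression_ratio` and both toy sequences are assumptions.

```python
import random
import zlib


def compression_ratio(sequence):
    """Compressed size divided by raw size; lower means more regularity found."""
    raw = sequence.encode("ascii")
    return len(zlib.compress(raw, 9)) / len(raw)


random.seed(0)  # fixed seed so the toy sequence is reproducible
repetitive = "ACG" * 200  # exact tandem repeat of a 3-base motif
random_like = "".join(random.choice("ACGT") for _ in range(600))

print(compression_ratio(repetitive), compression_ratio(random_like))
```

The tandem repeat should compress to a far smaller fraction of its raw size than the random-like sequence, which has no repeats for the compressor to exploit.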