1.
Open Res Eur ; 3: 185, 2023.
Article in English | MEDLINE | ID: mdl-38009089

ABSTRACT

Software development has become an integral part of the scholarly ecosystem, spanning all fields and disciplines. To support the sharing and creation of knowledge in line with open science principles, and particularly to enable the reproducibility of research results, it is crucial to make the source code of research software available, allowing for modification, reuse, and distribution. Recognizing the significance of open-source software contributions in academia, the second French Plan for Open Science, announced by the Minister of Higher Education and Research in 2021, introduced a National Award to promote open-source research software. This award serves multiple objectives: firstly, to highlight the software projects and teams that have devoted time and effort to develop outstanding research software, sometimes for decades, and often with little recognition; secondly, to draw attention to the importance of software as a valuable research output and to inspire new generations of researchers to follow and learn from these examples. We present here an in-depth analysis of the design and implementation of this unique initiative. As a national award established explicitly to foster Open Science practices by the French Minister of Research, it faced the intricate challenge of fairly evaluating open research software across all fields, striving for inclusivity across domains, applications, and participants. We provide a comprehensive report on the results of the first edition, which received 129 high-quality submissions. Additionally, we emphasize the impact of this initiative on the open science landscape, promoting software as a valuable research outcome, on par with publications.


Software is crucial for modern research. For the goals of open science, reproducibility, and wider reuse, sharing software source code and acknowledging software development are essential. In France, in 2021, the Minister of Higher Education and Research introduced the National Plan for Open Science. The plan highlights the role of open-source software in academia and aims to give software the same recognition as publications and data. A part of the plan is the introduction of a National Award to recognize open-source research software contributions. This award acknowledges software projects and their teams, which have often worked without much recognition. It also emphasizes the importance of software as a research output, hoping to inspire future researchers. This article examines the award's design and implementation. It addresses the challenges of assessing open research software from different research fields. In the first edition of the award, there were 129 high-quality submissions, indicating the award's potential to shift perspectives on software's role in open science, aligning it with the importance of academic publications. Through a detailed account of our experiences and the insights gained, we aim to provide a reference for other countries or institutions considering establishing similar forms of recognition.

2.
Methods Mol Biol ; 2453: 43-59, 2022.
Article in English | MEDLINE | ID: mdl-35622319

ABSTRACT

Within the EuroClonality-NGS group, immune repertoire analysis for target identification in lymphoid malignancies was initially developed using two-stage amplicon approaches, essentially as a progressive modification of preceding methods developed for Sanger sequencing. This approach has, however, limitations with respect to sample handling, adaptation to automation, and risk of contamination by amplicon products. We therefore developed one-step PCR amplicon methods with individual barcoding for batched analysis for IGH, IGK, TRD, TRG, and TRB rearrangements, followed by Vidjil-based data analysis.


Subject(s)
Genes, T-Cell Receptor; High-Throughput Nucleotide Sequencing; Immunoglobulins; Precursor Cell Lymphoblastic Leukemia-Lymphoma; Recombination, Genetic; Genes, T-Cell Receptor/genetics; Genes, T-Cell Receptor/immunology; High-Throughput Nucleotide Sequencing/methods; Humans; Immunoglobulins/genetics; Immunoglobulins/immunology; Neoplasm, Residual/diagnosis; Neoplasm, Residual/genetics; Precursor Cell Lymphoblastic Leukemia-Lymphoma/diagnosis; Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics; Precursor Cell Lymphoblastic Leukemia-Lymphoma/immunology; Recombination, Genetic/genetics; Recombination, Genetic/immunology
3.
Methods Mol Biol ; 2453: 153-167, 2022.
Article in English | MEDLINE | ID: mdl-35622326

ABSTRACT

B cell receptor (BcR) immunoglobulins (IG) display a tremendous diversity due to complex DNA rearrangements, the V(D)J recombination, further enhanced by the somatic hypermutation process. In chronic lymphocytic leukemia (CLL), the mutational load of the clonal BcR IG expressed by the leukemic cells constitutes an important prognostic and predictive biomarker. Here, we provide a reliable methodology capable of determining the mutational status of IG genes in CLL using high-throughput sequencing, starting from leukemic cell DNA or RNA.


Subject(s)
Leukemia, Lymphocytic, Chronic, B-Cell; Genes, Immunoglobulin; High-Throughput Nucleotide Sequencing; Humans; Immunoglobulins/genetics; Leukemia, Lymphocytic, Chronic, B-Cell/genetics; Receptors, Antigen, B-Cell/genetics
4.
Leukemia ; 33(9): 2241-2253, 2019 09.
Article in English | MEDLINE | ID: mdl-31243313

ABSTRACT

Amplicon-based next-generation sequencing (NGS) of immunoglobulin (IG) and T-cell receptor (TR) gene rearrangements for clonality assessment, marker identification and quantification of minimal residual disease (MRD) in lymphoid neoplasms has been the focus of intense research, development and application. However, standardization and validation in a scientifically controlled multicentre setting is still lacking. Therefore, IG/TR assay development and design, including bioinformatics, was performed within the EuroClonality-NGS working group and validated for MRD marker identification in acute lymphoblastic leukaemia (ALL). Five EuroMRD ALL reference laboratories performed IG/TR NGS in 50 diagnostic ALL samples, and compared results with those generated through routine IG/TR Sanger sequencing. A central polytarget quality control (cPT-QC) was used to monitor primer performance, and a central in-tube quality control (cIT-QC) was spiked into each sample as a library-specific quality control and calibrator. NGS identified 259 (average 5.2/sample, range 0-14) clonal sequences vs. Sanger-sequencing 248 (average 5.0/sample, range 0-14). NGS primers covered possible IG/TR rearrangement types more completely compared with local multiplex PCR sets and enabled sequencing of bi-allelic rearrangements and weak PCR products. The cPT-QC showed high reproducibility across all laboratories. These validated and reproducible quality-controlled EuroClonality-NGS assays can be used for standardized NGS-based identification of IG/TR markers in lymphoid malignancies.


Subject(s)
Gene Rearrangement, T-Lymphocyte/genetics; Genes, T-Cell Receptor/genetics; Genetic Markers/genetics; Immunoglobulins/genetics; Neoplasm, Residual/genetics; Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics; Computational Biology/methods; Genes, Immunoglobulin/genetics; High-Throughput Nucleotide Sequencing/methods; Humans; Receptors, Antigen, T-Cell/genetics; Recombination, Genetic/genetics; Reference Standards; Reproducibility of Results
5.
PeerJ Comput Sci ; 4: e148, 2018.
Article in English | MEDLINE | ID: mdl-33816803

ABSTRACT

BACKGROUND: Labels add information to a text, for example functional annotations such as genes on a DNA sequence. V(D)J recombinations are DNA recombinations involving two or three short genes in lymphocytes. Sequencing this short region (500 bp or less) produces labeled sequences and brings insight into the lymphocyte repertoire for onco-hematology or immunology studies. METHODS: We present two indexes for a text with non-overlapping labels. They store the text in a Burrows-Wheeler transform (BWT) and a compressed label sequence in a Wavelet Tree. The label sequence is taken in the order of the text (TL-index) or in the order of the BWT (TLBW-index). Both indexes need space related to the entropy of the labeled text. RESULTS: These indexes allow efficient text-label queries to count and find labeled patterns. The TLBW-index has an overhead on simple label queries but is very efficient on combined pattern-label queries. We implemented the indexes in C++ and compared them against a baseline solution on pseudo-random as well as on V(D)J labeled texts. DISCUSSION: New indexes such as the ones we propose improve the way we index and query labeled texts such as lymphocyte repertoires for hematological and immunological studies.
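To make the idea of a combined pattern-label query concrete, here is a minimal Python sketch. It is not the TL-index or TLBW-index from the paper: the suffix array is built naively, ranks are computed by scanning the BWT, and labels are kept as a plain uncompressed list in text order rather than in a Wavelet Tree. The function names and the convention that an occurrence "has" the label of its first character are assumptions made for illustration.

```python
def build_index(text, labels):
    """Build a toy BWT index. labels[i] is the label of text[i], or None."""
    assert len(text) == len(labels)
    t = text + "$"  # sentinel, assumed smaller than every text character
    # Naive suffix array: fine for a sketch, far too slow for real data.
    sa = sorted(range(len(t)), key=lambda i: t[i:])
    bwt = "".join(t[i - 1] for i in sa)  # t[-1] is the sentinel when i == 0
    # first[c] = number of characters in t strictly smaller than c
    counts = {}
    for ch in t:
        counts[ch] = counts.get(ch, 0) + 1
    first, total = {}, 0
    for ch in sorted(counts):
        first[ch] = total
        total += counts[ch]
    return {"sa": sa, "bwt": bwt, "first": first, "labels": labels}


def sa_interval(index, pattern):
    """Backward search: half-open suffix-array interval of the pattern."""
    bwt, first = index["bwt"], index["first"]
    lo, hi = 0, len(bwt)
    for ch in reversed(pattern):
        if ch not in first:
            return 0, 0
        # bwt.count(ch, 0, i) is a naive rank query
        lo = first[ch] + bwt.count(ch, 0, lo)
        hi = first[ch] + bwt.count(ch, 0, hi)
        if lo >= hi:
            return 0, 0
    return lo, hi


def find_labeled(index, pattern, label):
    """Start positions of pattern occurrences whose first character has label."""
    lo, hi = sa_interval(index, pattern)
    labels = index["labels"]
    return sorted(p for p in index["sa"][lo:hi]
                  if p < len(labels) and labels[p] == label)
```

For example, with the text `"ACGTACGTAC"` labeled `V` on its first four characters and `J` on its last four, `find_labeled(index, "AC", "J")` keeps only the occurrence at position 8. A real index would replace the list scan with rank and select operations on a compressed label sequence.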

6.
J Immunol ; 198(10): 3765-3774, 2017 05 15.
Article in English | MEDLINE | ID: mdl-28416603

ABSTRACT

Analysis and interpretation of Ig and TCR gene rearrangements in the conventional, low-throughput way have their limitations in terms of resolution, coverage, and biases. With the advent of high-throughput, next-generation sequencing (NGS) technologies, a deeper analysis of Ig and/or TCR (IG/TR) gene rearrangements is now within reach, which impacts on all main applications of IG/TR immunogenetic analysis. To bridge the generation gap from low- to high-throughput analysis, the EuroClonality-NGS Consortium has been formed, with the main objectives to develop, standardize, and validate the entire workflow of IG/TR NGS assays for 1) clonality assessment, 2) minimal residual disease detection, and 3) repertoire analysis. This concerns the preanalytical (sample preparation, target choice), analytical (amplification, NGS), and postanalytical (immunoinformatics) phases. Here we critically discuss pitfalls and challenges of IG/TR NGS methodology and its applications in hemato-oncology and immunology.


Subject(s)
Hematology/methods; High-Throughput Nucleotide Sequencing; Immunogenetics/methods; Immunologic Techniques; Alleles; Computational Biology/methods; Gene Rearrangement; Genes, Immunoglobulin; Genes, T-Cell Receptor/genetics; High-Throughput Nucleotide Sequencing/methods; Humans; Immunogenetics/standards
7.
PLoS One ; 12(2): e0172249, 2017.
Article in English | MEDLINE | ID: mdl-28182777

ABSTRACT

[This corrects the article DOI: 10.1371/journal.pone.0166126.].

8.
Leuk Res ; 53: 1-7, 2017 02.
Article in English | MEDLINE | ID: mdl-27930944

ABSTRACT

Minimal residual disease (MRD) is known to be an independent prognostic factor in patients with acute lymphoblastic leukemia (ALL). High-throughput sequencing (HTS) is currently used in routine practice for the diagnosis and follow-up of patients with hematological neoplasms. In this retrospective study, we examined the role of immunoglobulin/T-cell receptor-based MRD in patients with ALL by HTS analysis of immunoglobulin H and/or T-cell receptor gamma chain loci in bone marrow samples from 11 patients with ALL, at diagnosis and during follow-up. We assessed the clinical feasibility of using combined HTS and bioinformatics analysis with interactive visualization using Vidjil software. We discuss the advantages and drawbacks of HTS for monitoring MRD. HTS gives a more complete insight into the leukemic population than conventional real-time quantitative PCR (qPCR), and allows identification of new emerging clones at each time point of the monitoring. Thus, HTS monitoring of Ig/TR-based MRD is expected to improve the management of patients with ALL.


Subject(s)
High-Throughput Nucleotide Sequencing/methods; Neoplasm, Residual/diagnosis; Precursor Cell Lymphoblastic Leukemia-Lymphoma/diagnosis; Bone Marrow; Clone Cells/pathology; Follow-Up Studies; Genes, T-Cell Receptor gamma; Humans; Immunoglobulin Heavy Chains/genetics; Monitoring, Immunologic; Neoplasm, Residual/genetics; Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics; Retrospective Studies; Software
9.
PLoS One ; 11(11): e0166126, 2016.
Article in English | MEDLINE | ID: mdl-27835690

ABSTRACT

BACKGROUND: The B and T lymphocytes are white blood cells playing a key role in adaptive immunity. A part of their DNA, called the V(D)J recombinations, is specific to each lymphocyte, and enables recognition of specific antigens. Today, with new sequencing techniques, one can get billions of DNA sequences from these regions. With dedicated Repertoire Sequencing (RepSeq) methods, it is now possible to picture populations of lymphocytes, and to monitor more accurately the immune response as well as pathologies such as leukemia. METHODS AND RESULTS: Vidjil is an open-source platform for the interactive analysis of high-throughput sequencing data from lymphocyte recombinations. It contains an algorithm gathering reads into clonotypes according to their V(D)J junctions, a web application made of a sample, experiment and patient database and a visualization for the analysis of clonotypes over time. Vidjil is implemented in C++, Python and Javascript and licensed under the GPLv3 open-source license. Source code, binaries and a public web server are available at http://www.vidjil.org and at http://bioinfo.lille.inria.fr/vidjil. Using the Vidjil web application consists of four steps: 1. uploading a raw sequence file (typically a FASTQ); 2. running RepSeq analysis software; 3. visualizing the results; 4. annotating the results and saving them for future use. For the end-user, the Vidjil web application needs no specific installation and just requires a connection and a modern web browser. Vidjil is used by labs in hematology or immunology for research and clinical applications.


Subject(s)
Computational Biology/methods; High-Throughput Nucleotide Sequencing/methods; V(D)J Recombination/genetics; Web Browser; Algorithms; Base Sequence; Humans; Internet; Lymphocytes/immunology; Lymphocytes/metabolism; Reproducibility of Results; Sequence Homology, Nucleic Acid
10.
Br J Haematol ; 173(3): 413-20, 2016 05.
Article in English | MEDLINE | ID: mdl-26898266

ABSTRACT

High-throughput sequencing (HTS) is considered a technical revolution that has improved our knowledge of lymphoid and autoimmune diseases, changing our approach to leukaemia both at diagnosis and during follow-up. As part of an immunoglobulin/T cell receptor-based minimal residual disease (MRD) assessment of acute lymphoblastic leukaemia patients, we assessed the performance and feasibility of the replacement of the first steps of the approach based on DNA isolation and Sanger sequencing, using a HTS protocol combined with bioinformatics analysis and visualization using the Vidjil software. We prospectively analysed the diagnostic and relapse samples of 34 paediatric patients, thus identifying 125 leukaemic clones with recombinations on multiple loci (TRG, TRD, IGH and IGK), including Dd2/Dd3 and Intron/KDE rearrangements. Sequencing failures were halved (14% vs. 34%, P = 0.0007), enabling more patients to be monitored. Furthermore, more markers per patient could be monitored, reducing the probability of false negative MRD results. The whole analysis, from sample receipt to clinical validation, was shorter than our current diagnostic protocol, with equal resources. V(D)J recombination was successfully assigned by the software, even for unusual recombinations. This study emphasizes the progress that HTS with adapted bioinformatics tools can bring to the diagnosis of leukaemia patients.


Subject(s)
Computational Biology/methods; High-Throughput Nucleotide Sequencing/methods; Precursor Cell Lymphoblastic Leukemia-Lymphoma/diagnosis; Adolescent; Adult; Child; Child, Preschool; Clone Cells; Diagnostic Errors/prevention & control; Gene Rearrangement, T-Lymphocyte; High-Throughput Nucleotide Sequencing/standards; Humans; Infant; Infant, Newborn; Neoplasm, Residual/diagnosis; Prospective Studies; Software; V(D)J Recombination/genetics; Young Adult
12.
J Comput Biol ; 22(3): 190-204, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25768235

ABSTRACT

We introduce the concept of RNA multistructures, which is a formal grammar-based framework specifically designed to model a set of alternate RNA secondary structures. Such alternate structures can either be a set of suboptimal foldings, or distinct stable folding states, or variants within an RNA family. We provide several such examples and propose an efficient algorithm to search for RNA multistructures within a genomic sequence.


Subject(s)
RNA Folding; RNA, Transfer/chemistry; RNA/chemistry; Algorithms; Bacterial Proteins/chemistry; Genome; Humans; Inverted Repeat Sequences; Models, Molecular; RNA, Bacterial/chemistry; RNA, Mitochondrial; Ribonuclease P/chemistry; Riboswitch
13.
BMC Genomics ; 15: 409, 2014 May 28.
Article in English | MEDLINE | ID: mdl-24885090

ABSTRACT

BACKGROUND: V(D)J recombinations in lymphocytes are essential for immunological diversity. They are also useful markers of pathologies. In leukemia, they are used to quantify the minimal residual disease during patient follow-up. However, the full breadth of lymphocyte diversity is not fully understood. RESULTS: We propose new algorithms that process high-throughput sequencing (HTS) data to extract unnamed V(D)J junctions and gather them into clones for quantification. This analysis is based on a seed heuristic and is fast and scalable because in the first phase, no alignment is performed with germline database sequences. The algorithms were applied to TR γ HTS data from a patient with acute lymphoblastic leukemia, and also on data simulating hypermutations. Our methods identified the main clone, as well as additional clones that were not identified with standard protocols. CONCLUSIONS: The proposed algorithms provide new insight into the analysis of high-throughput sequencing data for leukemia, and also to the quantitative assessment of any immunological profile. The methods described here are implemented in a C++ open-source program called Vidjil.
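The alignment-free first phase described above can be illustrated with a small Python sketch. It is not Vidjil's implementation: the germline sequences are invented toy strings, the seed length `K` and window size `W` are arbitrary, and the rule for placing the window (midway between the last V seed and the first J seed) is a simplifying assumption. Reads are tagged by exact k-mer lookups only, and reads sharing the same junction window are counted as one clone.

```python
from collections import Counter

K = 4   # seed (k-mer) length; toy value chosen for the short example strings
W = 10  # size of the window extracted around the junction; also a toy value


def build_seed_index(v_genes, j_genes, k=K):
    """Map each germline k-mer to 'V' or 'J' ('?' if it occurs in both)."""
    index = {}
    for label, genes in (("V", v_genes), ("J", j_genes)):
        for gene in genes:
            for i in range(len(gene) - k + 1):
                kmer = gene[i:i + k]
                index[kmer] = label if index.get(kmer, label) == label else "?"
    return index


def junction_window(read, index, k=K, w=W):
    """Return the w-nt window centred on the V/J junction, or None."""
    tags = [index.get(read[i:i + k]) for i in range(len(read) - k + 1)]
    v_pos = [i for i, t in enumerate(tags) if t == "V"]
    j_pos = [i for i, t in enumerate(tags) if t == "J"]
    if not v_pos or not j_pos or min(j_pos) < max(v_pos):
        return None  # no junction found, or seeds in an inconsistent order
    v_end = max(v_pos) + k       # first position after the last V seed
    j_start = min(j_pos)         # start of the first J seed
    start = (v_end + j_start) // 2 - w // 2
    if start < 0 or start + w > len(read):
        return None              # read too short around the junction
    return read[start:start + w]


def cluster_reads(reads, index):
    """Count reads per junction window; each window stands for one clone."""
    windows = (junction_window(r, index) for r in reads)
    return Counter(win for win in windows if win is not None)
```

With the hypothetical germlines `V = "GATTACAGGCTCCAAG"` and `J = "TGGTCACCGTCTCTTC"`, two reads that carry the same inserted nucleotides between V and J but are trimmed differently at their ends fall into the same cluster, while a read with a different insertion forms its own cluster. No read is aligned against a germline database at this stage.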


Subject(s)
Algorithms; High-Throughput Nucleotide Sequencing/methods; Precursor Cell Lymphoblastic Leukemia-Lymphoma/diagnosis; Sequence Analysis, DNA/methods; V(D)J Recombination; Humans; Neoplasm, Residual/diagnosis; Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics; Software
14.
Int J Comput Biol Drug Des ; 6(1-2): 119-30, 2013.
Article in English | MEDLINE | ID: mdl-23428478

ABSTRACT

In this paper, we present a solution to the extreme similarity sequencing problem. The extreme similarity sequencing problem consists of finding occurrences of a pattern p in a set S(0), S(1), …, S(k), of sequences of equal length, where S(i), for all 1≤i≤k, differs from S(0) by a constant number of errors - around 10 in practice. We present an asymptotically fast O(n + occ log occ) time algorithm, as well as a practical O(nk/w) time algorithm for solving this problem, where n is the length of a sequence, occ is the number of candidate occurrences reported by our technique, w is the size of the machine word, and the total number of errors is bounded by k - the number of sequences.


Subject(s)
Algorithms; Sequence Analysis, DNA/methods; Sequence Homology, Nucleic Acid; Computational Biology; High-Throughput Nucleotide Sequencing
15.
J Comput Biol ; 19(10): 1120-33, 2012 Oct.
Article in English | MEDLINE | ID: mdl-23057822

ABSTRACT

RNA locally optimal secondary structures provide a concise and exhaustive description of all possible secondary structures of a given RNA sequence, and hence a very good representation of the RNA folding space. In this paper, we present an efficient algorithm that computes all locally optimal secondary structures for any folding model that takes into account the stability of helical regions. This algorithm is implemented in a program called regliss that runs on a publicly accessible web server.


Subject(s)
Algorithms; Internet; Nucleic Acid Conformation; RNA; Sequence Analysis, RNA/methods; Software; RNA/chemistry; RNA/genetics
16.
BMC Bioinformatics ; 9: 534, 2008 Dec 16.
Article in English | MEDLINE | ID: mdl-19087280

ABSTRACT

BACKGROUND: Similarity inference, one of the main bioinformatics tasks, has to face exponential growth of biological data. A classical approach used to cope with this data flow involves heuristics with large seed indexes. In order to speed up this technique, the index can be enhanced by storing additional information to limit the number of random memory accesses. However, this improvement leads to a larger index that may become a bottleneck. In the case of protein similarity search, we propose to decrease the index size by reducing the amino acid alphabet. RESULTS: The paper presents two main contributions. First, we show that an optimal neighborhood indexing combining an alphabet reduction and a longer neighborhood leads to a 35% reduction of the memory involved in the process, without sacrificing the quality of results or the computational time. Second, our approach led us to develop a new kind of substitution score matrices and their associated e-value parameters. In contrast to usual matrices, these matrices are rectangular since they compare amino acid groups from different alphabets. We describe the method used for computing those matrices and we provide some typical examples that can be used in such comparisons. Supplementary data can be found on the website http://bioinfo.lifl.fr/reblosum. CONCLUSION: We propose a practical index size reduction of the neighborhood data that does not negatively affect the performance of large-scale search in protein sequences. Such an index can be used in any study involving large protein data. Moreover, rectangular substitution score matrices and their associated statistical parameters can have applications in any study involving an alphabet reduction.
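The effect of an alphabet reduction on a seed index can be shown with a short Python sketch. The grouping of amino acids below is an illustrative assumption based on broad physico-chemical similarity, not the reduced alphabet evaluated in the paper, and the sketch indexes plain k-mers without the stored neighborhoods or the rectangular score matrices the authors describe.

```python
from collections import defaultdict

# Hypothetical grouping of the 20 amino acids into 8 classes (assumption).
GROUPS = ["LVIMC", "AG", "ST", "P", "FYW", "EDNQ", "KR", "H"]
# Each residue is rewritten as the first letter of its group.
REDUCE = {aa: group[0] for group in GROUPS for aa in group}


def reduce_seq(seq):
    """Rewrite a protein sequence over the reduced alphabet."""
    return "".join(REDUCE[aa] for aa in seq)


def build_index(proteins, k=3, reduced=True):
    """Map each k-mer to the list of (protein name, offset) where it occurs."""
    index = defaultdict(list)
    for name, seq in proteins.items():
        s = reduce_seq(seq) if reduced else seq
        for i in range(len(s) - k + 1):
            index[s[i:i + k]].append((name, i))
    return index
```

Two toy sequences such as `"MKVLDE"` and `"IRILNQ"` share no 3-mer over the full alphabet, so a full index holds eight distinct keys. After reduction both become `"LKLLEE"`, and the index holds four keys with two hits each: fewer keys to store, at the cost of more candidate hits per key to verify.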


Subject(s)
Abstracting and Indexing/methods; Algorithms; Computational Biology/methods; Sequence Alignment/methods; Sequence Analysis, Protein/methods; Databases, Protein; Information Storage and Retrieval; Proteins/chemistry
17.
Bioinformatics ; 22(16): 1948-54, 2006 Aug 15.
Article in English | MEDLINE | ID: mdl-16809391

ABSTRACT

MOTIVATION: The analysis of repeated elements in genomes is a fascinating domain of research that is lacking relevant tools for transposable elements (TEs), the most complex ones. The dynamics of TEs, which provides the main mechanism of mutation in some genomes, is an essential component of genome evolution. In this study we introduce a new concept of domain, a segmentation unit useful for describing the architecture of different copies of TEs. Our method extracts occurrences of a terminus-defined family of TEs, aligns the sequences, finds the domains in the alignment and searches the distribution of each domain in sequences. After a classification step relative to the presence or the absence of domains, the method results in a graphical view of sequences segmented into domains. RESULTS: Analysis of the new non-autonomous TE AtREP21 in the model plant Arabidopsis thaliana reveals copies of very different sizes and various combinations of domains which show the potential of our method. AVAILABILITY: DomainOrganizer web page is available at www.irisa.fr/symbiose/DomainOrganizer/.


Subject(s)
Computational Biology/methods; DNA Transposable Elements/genetics; Sequence Analysis, DNA/methods; Algorithms; Amino Acid Sequence; Arabidopsis/genetics; Genes, Plant; Markov Chains; Models, Biological; Models, Statistical; Molecular Sequence Data; Plant Proteins/chemistry; Protein Structure, Tertiary
18.
Genome Biol ; 6(10): R83, 2005.
Article in English | MEDLINE | ID: mdl-16207354

ABSTRACT

BACKGROUND: Dogs and rats have a highly developed capability to detect and identify odorant molecules, even at minute concentrations. Previous analyses have shown that the olfactory receptors (ORs) that specifically bind odorant molecules are encoded by the largest gene family sequenced in mammals so far. RESULTS: We identified five amino acid patterns characteristic of ORs in the recently sequenced boxer dog and brown Norway rat genomes. Using these patterns, we retrieved 1,094 dog genes and 1,493 rat genes from these shotgun sequences. The retrieved sequences constitute the olfactory receptor repertoires of these two animals. Subsets of 20.3% (for the dog) and 19.5% (for the rat) of these genes were annotated as pseudogenes as they had one or several mutations interrupting their open reading frames. We performed phylogenetic studies and organized these two repertoires into classes, families and subfamilies. CONCLUSION: We have established a complete or almost complete list of OR genes in the dog and the rat and have compared the sequences of these genes within and between the two species. Our results provide insight into the evolutionary development of these genes and the local amplifications that have led to the specific amplification of many subfamilies. We have also compared the human and rat ORs with the human and mouse OR repertoires.
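The pseudogene criterion mentioned above, one or several mutations interrupting the open reading frame, can be sketched in Python as follows. This is a simplified assumption about how such a check might look, not the authors' annotation pipeline: it only flags a coding sequence whose length is not a multiple of three (a likely frameshift) or that contains a stop codon before its final codon. The function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_is_interrupted(cds):
    """True if the coding sequence has a frameshift or a premature stop."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        return True  # length is not a whole number of codons
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    # A stop codon is only expected in the last position.
    return any(codon in STOP_CODONS for codon in codons[:-1])


def pseudogene_fraction(genes):
    """Fraction of sequences in a {name: cds} mapping flagged as interrupted."""
    flagged = sum(orf_is_interrupted(seq) for seq in genes.values())
    return flagged / len(genes)
```

Applied to a full repertoire, `pseudogene_fraction` would give a figure comparable in kind to the percentages reported in the abstract, although a real annotation would also have to handle sequencing gaps and partial genes.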


Subject(s)
Receptors, Odorant/chemistry; Animals; Conserved Sequence; Dogs; Genome/genetics; Multigene Family; Phylogeny; Pseudogenes/genetics; Rats; Sequence Analysis, Protein; Sequence Homology, Amino Acid