1.
Brain ; 147(1): 281-296, 2024 01 04.
Article in English | MEDLINE | ID: mdl-37721175

ABSTRACT

Congenital myasthenic syndromes (CMS) are a rare group of inherited disorders caused by gene defects associated with the neuromuscular junction and potentially treatable with commonly available medications such as acetylcholinesterase inhibitors and β2-adrenergic receptor agonists. In this study, we identified and genetically characterized the largest cohort of CMS patients from India to date. Genetic testing of clinically suspected patients evaluated in a South Indian hospital during the period 2014-19 was carried out by standard diagnostic gene panel testing or using a two-step method that included hotspot screening followed by whole-exome sequencing. In total, 156 genetically diagnosed patients (141 families) were characterized and the mutational spectrum and genotype-phenotype correlation described. Overall, 87 males and 69 females were evaluated, with the age of onset ranging from congenital to fourth decade (mean 6.6 ± 9.8 years). The mean age at diagnosis was 19 ± 12.8 years (range 1-56 years), with a mean diagnostic delay of 12.5 ± 9.9 years (range 0-49 years). Disease-causing variants in 17 CMS-associated genes were identified in 132 families (93.6%), while in nine families (6.4%), variants in genes not associated with CMS were found. Overall, postsynaptic defects were most common (62.4%), followed by glycosylation defects (21.3%), synaptic basal lamina genes (4.3%) and presynaptic defects (2.8%). Other genes found to cause neuromuscular junction defects (DES, TEFM) in our cohort accounted for 2.8%. Among the individual CMS genes, the most commonly affected gene was CHRNE (39.4%), followed by DOK7 (14.4%), DPAGT1 (9.8%), GFPT1 (7.6%), MUSK (6.1%), GMPPB (5.3%) and COLQ (4.5%). We identified 22 recurrent variants in this study, of which eight were found to be geographically specific to the Indian subcontinent. Apart from the known common variants CHRNE p.E443Kfs*64 (11.4%) and DOK7 p.A378Sfs*30 (9.3%), we identified seven novel recurrent variants specific to this cohort, including DPAGT1 p.T380I and DES c.1023+5G>A, for which founder haplotypes are suspected. This study highlights the geographic differences in the frequencies of various causative CMS genes and underlines the increasing significance of glycosylation genes (DPAGT1, GFPT1 and GMPPB) as a cause of neuromuscular junction defects. Myopathy and muscular dystrophy genes such as GMPPB and DES, presenting as gradually progressive limb girdle CMS, expand the phenotypic spectrum. The novel genes MACF1 and TEFM identified in this cohort add to the expanding list of genes with new mechanisms causing neuromuscular junction defects.


Subject(s)
Myasthenic Syndromes, Congenital; Male; Female; Humans; Child; Adolescent; Young Adult; Adult; Myasthenic Syndromes, Congenital/diagnosis; Acetylcholinesterase; Delayed Diagnosis; Neuromuscular Junction/genetics; Genetic Testing; Mutation/genetics
2.
Gastric Cancer ; 26(5): 653-666, 2023 09.
Article in English | MEDLINE | ID: mdl-37249750

ABSTRACT

BACKGROUND: Germline CDH1 pathogenic or likely pathogenic variants cause hereditary diffuse gastric cancer (HDGC). Once a genetic cause is identified, asymptomatic CDH1 carriers are offered stomach and breast surveillance and/or prophylactic surgery, which is life-saving. Herein, we characterized an inherited mechanism responsible for extremely early-onset gastric cancer and atypically high HDGC penetrance. METHODS: Whole-exome sequencing (WES) re-analysis was performed in an unsolved HDGC family. Accessible chromatin and CDH1 promoter interactors were evaluated in normal stomach by ATAC-seq and 4C-seq, and functional analysis was performed using CRISPR-Cas9, RNA-seq and pathway analysis. RESULTS: We identified a germline heterozygous 23 kb CDH1-TANGO6 deletion in a family with eight diffuse gastric cancers, six before age 30. The atypically high HDGC penetrance and young age at cancer onset argued for a role of the deleted region downstream of CDH1, which we showed to contain accessible chromatin and CDH1 promoter interactors in normal stomach. CRISPR-Cas9-edited cells mimicking the CDH1-TANGO6 deletion displayed the strongest CDH1 mRNA downregulation and more strongly affected adhesion-associated, type-I interferon immune-associated and oncogenic signalling pathways than wild-type or CDH1-deleted cells. This finding solved an 18-year family odyssey and engaged carrier family members in a cancer prevention pathway of care. CONCLUSION: In this work, we demonstrated that regulatory elements lying downstream of CDH1 are part of a chromatin network that controls CDH1 expression and influences the cell transcriptome and associated signalling pathways, likely explaining the high disease penetrance and very young age at cancer onset. This study highlights the importance of incorporating scientific-technological updates and clinical guidelines in routine diagnosis, given their impact on timely genetic diagnosis and disease prevention.


Subject(s)
Adenocarcinoma; Stomach Neoplasms; Humans; Adult; Stomach Neoplasms/pathology; Penetrance; Genetic Predisposition to Disease; Cadherins/genetics; Chromatin; Germ-Line Mutation; Antigens, CD/genetics
3.
Hum Mutat ; 37(12): 1263-1271, 2016 12.
Article in English | MEDLINE | ID: mdl-27604516

ABSTRACT

As whole genome sequencing becomes cheaper and faster, it will progressively substitute targeted next-generation sequencing as standard practice in research and diagnostics. However, computing cost-performance ratio is not advancing at an equivalent rate. Therefore, it is essential to evaluate the robustness of the variant detection process taking into account the computing resources required. We have benchmarked six combinations of state-of-the-art read aligners (BWA-MEM and GEM3) and variant callers (FreeBayes, GATK HaplotypeCaller, SAMtools) on whole genome and whole exome sequencing data from the NA12878 human sample. Results have been compared between them and against the NIST Genome in a Bottle (GIAB) variants reference dataset. We report differences in speed of up to 20 times in some steps of the process and have observed that SNV, and to a lesser extent InDel, detection is highly consistent in 70% of the genome. SNV, and especially InDel, detection is less reliable in 20% of the genome, and almost unfeasible in the remaining 10%. These findings will aid in choosing the appropriate tools bearing in mind objectives, workload, and computing infrastructure available.


Subject(s)
Computational Biology/methods; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Exome; Genetic Variation; Genome, Human; Humans; Software
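
The comparison of a caller's output against a reference variant set, as described in the abstract above, can be sketched as follows. This is a minimal illustration and not the authors' benchmarking pipeline: the function name `compare_to_truth`, the `Metrics` type, and the exact-match rule on (chromosome, position, ref, alt) are assumptions made for the example. Real comparisons usually normalize variant representation first.

```python
from typing import NamedTuple, Set, Tuple

# A variant is identified here by (chromosome, position, ref allele, alt allele).
# Exact tuple matching is a simplifying assumption for this sketch.
Variant = Tuple[str, int, str, str]


class Metrics(NamedTuple):
    precision: float
    recall: float
    f1: float


def compare_to_truth(called: Set[Variant], truth: Set[Variant]) -> Metrics:
    """Score a call set against a reference ("truth") set of variants."""
    tp = len(called & truth)   # called and present in the reference
    fp = len(called - truth)   # called but absent from the reference
    fn = len(truth - called)   # in the reference but not called
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return Metrics(precision, recall, f1)


if __name__ == "__main__":
    # Hypothetical toy data, not taken from any real sample.
    truth = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
             ("chr2", 50, "G", "GA"), ("chr2", 90, "T", "C")}
    called = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
              ("chr2", 50, "G", "GA"), ("chr3", 10, "A", "T")}
    print(compare_to_truth(called, truth))
```

Running the same function separately on SNVs and indels, or on different genomic regions, would give per-category figures of the kind the abstract summarizes.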
4.
Genome Res ; 22(3): 478-85, 2012 Mar.
Article in English | MEDLINE | ID: mdl-22128134

ABSTRACT

Insertions and deletions (indels), together with nucleotide substitutions, are major drivers of sequence evolution. An excess of deletions over insertions in genomic sequences (the so-called deletional bias) has been reported in a wide range of species, including mammals. However, this bias has not been found in the coding sequences of some mammalian species, such as human and mouse. To determine the strength of the deletional bias in mammals, and the influence of mutation and selection, we have quantified indels in both neutrally evolving noncoding sequences and protein-coding sequences, in six mammalian branches: human, macaque, ancestral primate, mouse, rat, and ancestral rodent. The results obtained with an improved algorithm for the placement of insertions in multiple alignments, Prank(+F), indicate that contrary to previous results, the only mammalian branch with a strong deletional bias is the rodent ancestral branch. We estimate that such a bias has resulted in an ~2.5% sequence loss of mammalian syntenic region in the ancestor of the mouse and rat. Further, a comparison of coding and noncoding sequences shows that negative selection is acting more strongly against mutations generating amino acid insertions than against mutations resulting in amino acid deletions. The strength of selection against indels is found to be higher in the rodent branches than in the primate branches, consistent with the larger effective population sizes of the rodents.


Subject(s)
Mammals/genetics; Sequence Deletion; Amino Acid Sequence; Animals; Cattle; Evolution, Molecular; Humans; Macaca mulatta; Mice; Molecular Sequence Data; Mutagenesis, Insertional; Open Reading Frames; RNA, Untranslated; Rats; Rodentia/genetics; Sequence Alignment; Tandem Repeat Sequences
5.
Mol Biol Evol ; 30(8): 1830-42, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23625888

ABSTRACT

Gene duplication is widely regarded as a major mechanism modeling genome evolution and function. However, the mechanisms that drive the evolution of the two, initially redundant, gene copies are still ill defined. Many gene duplicates experience evolutionary rate acceleration, but the relative contribution of positive selection and random drift to the retention and subsequent evolution of gene duplicates, and for how long the molecular clock may be distorted by these processes, remains unclear. Focusing on rodent genes that duplicated before and after the mouse and rat split, we find significantly increased sequence divergence after duplication in only one of the copies, which in nearly all cases corresponds to the novel daughter copy, independent of the mechanism of duplication. We observe that the evolutionary rate of the accelerated copy, measured as the ratio of nonsynonymous to synonymous substitutions, is on average 5-fold higher in the period spanning 4-12 My after the duplication than it was before the duplication. This increase can be explained, at least in part, by the action of positive selection according to the results of the maximum likelihood-based branch-site test. Subsequently, the rate decelerates until purifying selection completely returns to preduplication levels. Reversion to the original rates has already been accomplished 40.5 My after the duplication event, corresponding to a genetic distance of about 0.28 synonymous substitutions per site. Differences in tissue gene expression patterns parallel those of substitution rates, reinforcing the role of neofunctionalization in explaining the evolution of young gene duplicates.


Subject(s)
Evolution, Molecular; Gene Duplication; Genes, Duplicate; Animals; Chromosomal Position Effects; INDEL Mutation; Mice; Organ Specificity/genetics; Rats; Selection, Genetic
6.
Gigascience ; 13, 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-39302238

ABSTRACT

The Solve-RD project brings together clinicians, scientists, and patient representatives from 51 institutes spanning 15 countries to collaborate on genetically diagnosing ("solving") rare diseases (RDs). The project aims to significantly increase the diagnostic success rate by co-analyzing data from thousands of RD cases, including phenotypes, pedigrees, exome/genome sequencing, and multiomics data. Here we report on the data infrastructure devised and created to support this co-analysis. This infrastructure enables users to store, find, connect, and analyze data and metadata in a collaborative manner. Pseudonymized phenotypic and raw experimental data are submitted to the RD-Connect Genome-Phenome Analysis Platform and processed through standardized pipelines. Resulting files and novel produced omics data are sent to the European Genome-Phenome Archive, which adds unique file identifiers and provides long-term storage and controlled access services. MOLGENIS "RD3" and Café Variome "Discovery Nexus" connect data and metadata and offer discovery services, and secure cloud-based "Sandboxes" support multiparty data analysis. This successfully deployed and useful infrastructure design provides a blueprint for other projects that need to analyze large amounts of heterogeneous data.


Subject(s)
Rare Diseases; Rare Diseases/genetics; Humans; Databases, Genetic; Phenotype; Metadata; Computational Biology/methods; Genomics/methods
7.
Mol Biol Evol ; 28(1): 383-98, 2011 Jan.
Article in English | MEDLINE | ID: mdl-20688808

ABSTRACT

The molecular clock hypothesis states that protein-coding genes evolve at an approximately constant rate. However, this is only expected to be true as long as the function and the tertiary structure of the molecule remain unaltered. An important implication of this statement is that significant deviations in the rate of evolution of a gene with respect to the species clock are likely to reflect functional and/or structural alterations. Here, we present a method to identify such deviations and apply it to a data set of 2,929 high-quality coding sequence alignments corresponding to one-to-one orthologous genes from six mammalian species: human, macaque, mouse, rat, cow, and dog. Deviated branches are defined as those that present significant alterations in both the rate of nonsynonymous substitutions (dN) and the selective pressure (dN/dS). Strikingly, we find that as many as 24.5% of the genes show branch-specific deviations in dN and dN/dS, though this is a relatively well-conserved set of genes. Around half of these genes show branch-specific acceleration of evolutionary rates. Positive selection (PS) tests based on divergence data only identify 17.7% of the accelerated branches. Failure to identify PS in accelerated branches with an excess of radical amino acid replacements suggests that these tests are conservative. Interestingly, genes with accelerated branches are significantly enriched in neural proteins, indicating that this type of protein might play a more important role than previously thought in species diversification, although they are generally not detected by PS tests. We discuss in detail several examples of genes that show lineage-specific evolutionary rate acceleration and are involved in synaptic transmission, chemosensory perception, and ubiquitination.


Subject(s)
Evolution, Molecular; Mammals/genetics; Selection, Genetic; Amino Acid Sequence; Amino Acid Substitution; Animals; F-Box Proteins/genetics; Genetic Variation; Humans; Molecular Sequence Data; Receptors, G-Protein-Coupled/genetics; Receptors, N-Methyl-D-Aspartate/genetics; Receptors, Odorant/genetics; Sequence Alignment
8.
Eur J Med Genet ; 65(1): 104402, 2022 Jan.
Article in English | MEDLINE | ID: mdl-34863918

ABSTRACT

Almost half of all individuals affected by intellectual disability (ID) remain undiagnosed. In the Solve-RD project, exome sequencing (ES) datasets from unresolved individuals with (syndromic) ID (n = 1,472 probands) are systematically reanalyzed, starting from raw sequencing files, followed by genome-wide variant calling and new data interpretation. This strategy led to the identification of a disease-causing de novo missense variant in TUBB3 in a girl with severe developmental delay, secondary microcephaly, brain imaging abnormalities, high hypermetropia, strabismus and short stature. Interestingly, the TUBB3 variant could only be identified through reanalysis of ES data using a genome-wide variant calling approach, despite being located in protein coding sequence. More detailed analysis revealed that the position of the variant within exon 5 of TUBB3 was not targeted by the enrichment kit, although consistent high-quality coverage was obtained at this position, resulting from nearby targets that provide off-target coverage. In the initial analysis, variant calling was restricted to the exon targets ± 200 bases, allowing the variant to escape detection by the variant calling algorithm. This phenomenon may potentially occur more often, as we determined that 36 established ID genes have robust off-target coverage in coding sequence. Moreover, within these regions, for 17 genes (likely) pathogenic variants have been identified before. Therefore, this clinical report highlights that, although compute-intensive, performing genome-wide variant calling instead of target-based calling may lead to the detection of diagnostically relevant variants that would otherwise remain unnoticed.


Subject(s)
Intellectual Disability/genetics; Tubulin/genetics; Adolescent; Brain/abnormalities; Developmental Disabilities/genetics; Face/abnormalities; Female; Humans; Microcephaly/genetics; Mutation, Missense; Strabismus/genetics; Exome Sequencing
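
The way a variant can escape target-restricted calling, as described in the abstract above, can be illustrated with a small interval check. This is only a sketch: the function name `in_padded_targets`, the chromosome label `chrN`, and all coordinates are hypothetical, and only the ± 200 base padding is taken from the abstract.

```python
from typing import List, Tuple

# A capture target as (chromosome, start, end), 1-based inclusive coordinates.
Interval = Tuple[str, int, int]


def in_padded_targets(chrom: str, pos: int, targets: List[Interval],
                      padding: int = 200) -> bool:
    """Return True if a position lies within any target extended by `padding` bases."""
    return any(
        t_chrom == chrom and start - padding <= pos <= end + padding
        for t_chrom, start, end in targets
    )


if __name__ == "__main__":
    # Hypothetical enrichment target; not real kit coordinates.
    targets = [("chrN", 1000, 1200)]
    for pos in (1350, 1500):
        status = "considered" if in_padded_targets("chrN", pos, targets) else "skipped"
        print(f"position {pos}: {status} by target-restricted calling")
```

A position that is well covered by off-target reads but falls outside every padded interval is never evaluated under this scheme, whereas genome-wide calling would evaluate it.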
9.
Brain Commun ; 4(2): fcac030, 2022.
Article in English | MEDLINE | ID: mdl-35310830

ABSTRACT

Spinocerebellar ataxias consist of a highly heterogeneous group of inherited movement disorders clinically characterized by progressive cerebellar ataxia variably associated with additional distinctive clinical signs. The genetic heterogeneity is evidenced by the myriad of associated genes and underlying genetic defects identified. In this study, we describe a new spinocerebellar ataxia subtype in nine members of a Spanish five-generation family from Menorca with affected individuals variably presenting with ataxia, nystagmus, dysarthria, polyneuropathy, pyramidal signs, cerebellar atrophy and distinctive cerebral demyelination. Affected individuals presented with horizontal and vertical gaze-evoked nystagmus and hyperreflexia as initial clinical signs, and a variable age of onset ranging from 12 to 60 years. Neurophysiological studies showed moderate axonal sensory polyneuropathy with altered sympathetic skin response predominantly in the lower limbs. We identified the c.1877C>T (p.Ser626Leu) pathogenic variant within the SAMD9L gene as the disease causative genetic defect with a significant log-odds score (Zmax = 3.43; θ = 0.00; P < 3.53 × 10⁻⁵). We demonstrate the mitochondrial location of human SAMD9L protein, and its decreased levels in patients' fibroblasts in addition to mitochondrial perturbations. Furthermore, mutant SAMD9L in zebrafish impaired mobility and vestibular/sensory functions. This study describes a novel spinocerebellar ataxia subtype caused by SAMD9L mutation, SCA49, which triggers mitochondrial alterations pointing to a role of SAMD9L in neurological motor and sensory functions.

10.
Genes (Basel) ; 12(12), 2021 12 13.
Article in English | MEDLINE | ID: mdl-34946927

ABSTRACT

Homozygous deletions (HDs) may cause rare diseases and cancer, and discovering them in targeted sequencing data is challenging. Several tools have been developed for HD discovery, but a sensitive caller is still lacking. We present VarGenius-HZD, a sensitive and scalable algorithm that leverages breadth-of-coverage for the detection of rare homozygous and hemizygous single-exon deletions. To assess its effectiveness, we detected both real and synthetic rare HDs in fifty exomes from the 1000 Genomes Project, obtaining higher sensitivity than state-of-the-art algorithms, each of which missed at least one event. We then applied our tool to targeted sequencing data from patients with Inherited Retinal Dystrophies and solved five cases that still lacked a genetic diagnosis. We provide VarGenius-HZD either stand-alone or integrated within our recently developed software, enabling the automated selection of samples using the internal database. Hence, it could be extremely useful for both diagnostic and research purposes.


Subject(s)
DNA Copy Number Variations/genetics; Sequence Analysis, DNA/methods; Sequence Deletion/genetics; Algorithms; Animals; Base Sequence/genetics; Exome/genetics; Exons/genetics; High-Throughput Nucleotide Sequencing/methods; Humans
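
The breadth-of-coverage idea mentioned in the abstract above can be sketched as follows. This is not the VarGenius-HZD algorithm, whose details the abstract does not give. It is a simplified illustration in which the function names, the thresholds (`max_sample_boc`, `min_other_boc`) and the toy data are all assumptions: an exon is flagged when it is almost uncovered in one sample yet well covered in every other sample.

```python
from typing import Dict, List, Sequence


def breadth_of_coverage(depths: Sequence[int], min_depth: int = 1) -> float:
    """Fraction of positions in a region covered by at least `min_depth` reads."""
    if not depths:
        raise ValueError("region has no positions")
    return sum(d >= min_depth for d in depths) / len(depths)


def candidate_deletions(sample: Dict[str, float],
                        others: Dict[str, List[float]],
                        max_sample_boc: float = 0.1,
                        min_other_boc: float = 0.9) -> List[str]:
    """Flag exons with near-zero breadth in `sample` but high breadth elsewhere.

    Both thresholds are hypothetical values chosen for illustration.
    """
    flagged = []
    for exon, boc in sample.items():
        reference = others.get(exon, [])
        # Require the exon to be well covered in all comparison samples, so that
        # regions that are simply hard to capture are not reported as deletions.
        if boc <= max_sample_boc and reference and min(reference) >= min_other_boc:
            flagged.append(exon)
    return flagged


if __name__ == "__main__":
    print(breadth_of_coverage([0, 0, 5, 10]))
    sample = {"exon1": 0.0, "exon2": 0.98, "exon3": 0.05}
    others = {"exon1": [1.0, 0.97], "exon2": [1.0, 0.99], "exon3": [0.2, 0.1]}
    print(candidate_deletions(sample, others))
```

Comparing against other samples from the same run is one plausible way to separate a true deletion from an exon that the capture kit covers poorly in everyone.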
11.
PLoS One ; 16(10): e0258766, 2021.
Article in English | MEDLINE | ID: mdl-34653234

ABSTRACT

Angelman syndrome (AS) is a neurogenetic disorder characterized by severe developmental delay with absence of speech, happy disposition, frequent laughter, hyperactivity, stereotypies, ataxia and seizures with specific EEG abnormalities. In 10-15% of patients with an AS phenotype, the genetic cause remains unknown (Angelman-like syndrome, AS-like). Whole-exome sequencing (WES) was performed on a cohort of 14 patients with clinical features of AS and no molecular diagnosis. As a result, we identified 10 de novo and 1 X-linked pathogenic/likely pathogenic variants in 10 neurodevelopmental genes (SYNGAP1, VAMP2, TBL1XR1, ASXL3, SATB2, SMARCE1, SPTAN1, KCNQ3, SLC6A1 and LAS1L) and one deleterious de novo variant in a candidate gene (HSF2). Our results highlight the wide genetic heterogeneity in AS-like patients and expand the differential diagnosis.


Subject(s)
Angelman Syndrome/genetics; Exome Sequencing/methods; Gene Regulatory Networks; Adolescent; Adult; Child; Female; Genetic Association Studies; Genetic Predisposition to Disease; Heat-Shock Proteins; Humans; Infant; Male; Matrix Attachment Region Binding Proteins/genetics; Receptors, Cytoplasmic and Nuclear/genetics; Repressor Proteins/genetics; Transcription Factors/genetics; Vesicle-Associated Membrane Protein 2/genetics; Young Adult
12.
J Neurol ; 267(12): 3643-3649, 2020 Dec.
Article in English | MEDLINE | ID: mdl-32656641

ABSTRACT

BACKGROUND: Behr syndrome is a clinically distinct, but genetically heterogeneous disorder characterized by optic atrophy, progressive spastic paraparesis, and motor neuropathy often associated with ataxia. The molecular diagnosis is based on gene panel testing or whole-exome/genome sequencing. METHODS: Here, we report the clinical presentation of two siblings with a novel genetic form of Behr syndrome. We performed whole-exome sequencing in the two patients and their mother. RESULTS: Both patients had a childhood-onset, slowly progressive disease resembling Behr syndrome, starting with visual impairment, followed by progressive spasticity, weakness, and atrophy of the lower legs and ataxia. They also developed scoliosis, leading to respiratory problems. In their late 30s, both siblings developed hypertrophic cardiomyopathy and died of sudden cardiac death at ages 43 and 40, respectively. Whole-exome sequencing identified the novel homozygous c.627_629del; p.(Gly210del) deletion in UCHL1. CONCLUSIONS: The presentation of our patients raises the possibility that hypertrophic cardiomyopathy may be an additional feature of the clinical syndrome associated with UCHL1 mutations, and highlights the importance of cardiac follow-up and treatment in neurodegenerative disease associated with UCHL1 mutations.


Subject(s)
Cardiomyopathy, Hypertrophic; Neurodegenerative Diseases; Optic Atrophy; Spastic Paraplegia, Hereditary; Ataxia; Child; Hearing Loss; Humans; Intellectual Disability; Mutation/genetics; Optic Atrophy/congenital; Optic Atrophy/genetics; Pedigree; Spasm; Ubiquitin Thiolesterase
13.
F1000Res ; 9, 2020.
Article in English | MEDLINE | ID: mdl-34367618

ABSTRACT

Copy number variations (CNVs) are major contributors to both genetic diseases and human neoplasias. While high-throughput sequencing technologies are increasingly the primary choice for genomic screening, their ability to detect CNVs efficiently is still heterogeneous and remains to be developed. The aim of this white paper is to provide a guiding framework for the future contributions of ELIXIR's recently established human CNV Community, with implications beyond human disease diagnostics and population genomics. This white paper is the direct result of a strategy meeting that took place in September 2018 in Hinxton (UK) and involved representatives of 11 ELIXIR Nodes. The meeting led to the definition of priority objectives and tasks to address a wide range of CNV-related challenges, ranging from detection and interpretation to sharing and training. Here, we provide suggestions on how to align these tasks within the ELIXIR Platforms strategy, and on how to frame the activities of this new ELIXIR Community in the international context.


Subject(s)
Computational Biology; DNA Copy Number Variations; DNA Copy Number Variations/genetics; High-Throughput Nucleotide Sequencing; Humans
14.
Mol Genet Genomic Med ; 7(1): e00511, 2019 01.
Article in English | MEDLINE | ID: mdl-30548424

ABSTRACT

BACKGROUND: Patients affected by Angelman syndrome (AS) present severe intellectual disability, lack of speech, ataxia, seizures, abnormal electroencephalography (EEG), and a characteristic behavioral phenotype. Around 10% of patients with a clinical diagnosis of AS (AS-like) do not have an identifiable molecular defect. Some of these patients harbor alternative genetic defects that present overlapping features with AS. METHODS: Trio whole-exome sequencing was performed on DNA extracted from peripheral blood of the patient and parents. Exome data were filtered assuming de novo autosomal dominant inheritance. cDNA analysis was carried out to assess the effect of the splice site variant. RESULTS: We identified a novel heterozygous SMARCE1 splicing variant that leads to exon skipping in a patient with an Angelman-like phenotype. Missense variants in the SMARCE1 gene are known to cause Coffin-Siris syndrome (CSS), which is a rare congenital syndrome. Clinical reevaluation of the patient confirmed the presence of characteristic clinical features of CSS, many of them overlapping with AS. CONCLUSIONS: Taking into account the novel finding reported in this study, we consider that CSS should be added to the expanding list of differential diagnoses for AS.


Subject(s)
Angelman Syndrome/genetics; Chromosomal Proteins, Non-Histone/genetics; DNA-Binding Proteins/genetics; Phenotype; Adolescent; Angelman Syndrome/pathology; Exome; Humans; Male; Mutation, Missense; RNA Splicing
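
Filtering trio exome data for de novo variants, as mentioned in the methods above, can be sketched with a simple genotype rule. This is an illustrative assumption, not the authors' filtering procedure: the `TrioCall` type, the function names and the variant identifiers are hypothetical, and real pipelines also consider read depth, genotype quality and population frequency.

```python
from typing import List, NamedTuple


class TrioCall(NamedTuple):
    variant_id: str
    child: str    # VCF-style genotype, e.g. "0/1"
    mother: str
    father: str


def normalize(genotype: str) -> str:
    """Treat phased and unphased genotypes alike and order the alleles ("1|0" -> "0/1")."""
    return "/".join(sorted(genotype.replace("|", "/").split("/")))


def is_de_novo_dominant(call: TrioCall) -> bool:
    """Heterozygous in the child, homozygous reference in both parents."""
    return (normalize(call.child) == "0/1"
            and normalize(call.mother) == "0/0"
            and normalize(call.father) == "0/0")


def filter_de_novo(calls: List[TrioCall]) -> List[TrioCall]:
    return [call for call in calls if is_de_novo_dominant(call)]


if __name__ == "__main__":
    # Hypothetical calls for illustration only.
    calls = [
        TrioCall("var1", "0/1", "0/0", "0/0"),   # candidate de novo
        TrioCall("var2", "0/1", "0/1", "0/0"),   # inherited from the mother
        TrioCall("var3", "1/1", "0/1", "0/1"),   # recessive pattern
    ]
    print([call.variant_id for call in filter_de_novo(calls)])
```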
15.
Cancers (Basel) ; 11(3), 2019 03 13.
Article in English | MEDLINE | ID: mdl-30871259

ABSTRACT

Colorectal cancer (CRC) shows aggregation in some families but no alterations in the known hereditary CRC genes. We aimed to identify new candidate genes that are potentially involved in germline predisposition to familial CRC. An integrated analysis of germline and tumor whole-exome sequencing data was performed in 18 unrelated CRC families. Deleterious single nucleotide variants (SNVs), short insertions and deletions (indels), copy number variants (CNVs) and loss of heterozygosity (LOH) were assessed as candidates for first germline or second somatic hits. Candidate tumor suppressor genes were selected when alterations were detected in both germline and somatic DNA, fulfilling Knudson's two-hit hypothesis. Somatic mutational profiling and signature analysis were also performed. A series of germline-somatic variant pairs were detected. In all cases, the first hit was a rare SNV/indel, whereas the second hit was either a different SNV (3 genes) or LOH affecting the same gene (141 genes). BRCA2, BLM, ERCC2, RECQL, REV3L and RIF1 were among the most promising candidate genes for germline CRC predisposition. Our integrated analysis enabled the identification of new candidate genes involved in familial CRC. Further functional studies and replication in additional cohorts are required to confirm the selected candidates.

16.
Genome Biol Evol ; 5(2): 457-67, 2013.
Article in English | MEDLINE | ID: mdl-23377868

ABSTRACT

Large-scale evolutionary studies often require the automated construction of alignments of a large number of homologous gene families. The majority of eukaryotic genes can produce different transcripts due to alternative splicing or transcription initiation, and many such transcripts encode different protein isoforms. As analyses tend to be gene centered, one single-protein isoform per gene is selected for the alignment, with the de facto approach being to use the longest protein isoform per gene (Longest), presumably to avoid including partial sequences and to maximize sequence information. Here, we show that this approach is problematic because it increases the number of indels in the alignments due to the inclusion of nonhomologous regions, such as those derived from species-specific exons, increasing the number of misaligned positions. With the aim of ameliorating this problem, we have developed a novel heuristic, Protein ALignment Optimizer (PALO), which, for each gene family, selects the combination of protein isoforms that are most similar in length. We examine several evolutionary parameters inferred from alignments in which the only difference is the method used to select the protein isoform combination: Longest, PALO, the combination that results in the highest sequence conservation, and a randomly selected combination. We observe that Longest tends to overestimate both nonsynonymous and synonymous substitution rates when compared with PALO, which is most likely due to an excess of misaligned positions. The estimation of the fraction of genes that have experienced positive selection by maximum likelihood is very sensitive to the method of isoform selection employed, both when alignments are constructed with MAFFT and with Prank(+F). Longest performs better than a random combination but still estimates up to 3 times more positively selected genes than the combination showing the highest conservation, indicating the presence of many false positives. We show that PALO can eliminate the majority of such false positives and thus that it is a more appropriate approach for large-scale analyses than Longest. A web server has been set up to facilitate the use of PALO given a user-defined set of gene families; it is available at http://evolutionarygenomics.imim.es/palo.


Subject(s)
Evolution, Molecular; Selection, Genetic/genetics; Sequence Alignment/methods; Sequence Analysis, DNA/methods; Algorithms; Conserved Sequence/genetics; Genome; Internet; Phylogeny; Protein Isoforms/genetics; Software; Species Specificity
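
The selection rule described in the abstract above (for each gene family, pick the combination of isoforms most similar in length) can be sketched as follows. This is not the PALO implementation: the function name, the use of the length range (longest minus shortest) as the similarity measure, and the exhaustive search are assumptions made for illustration, and the published heuristic may differ on all three points.

```python
from itertools import product
from typing import Dict


def select_similar_length(isoforms: Dict[str, Dict[str, int]]) -> Dict[str, str]:
    """Pick one isoform per gene so that the chosen lengths are as close as possible.

    `isoforms` maps each gene (one per species in the family) to a dictionary of
    isoform name -> protein length. Closeness is measured here as max minus min
    of the chosen lengths, an assumed criterion for this sketch.
    """
    genes = sorted(isoforms)
    if any(not isoforms[g] for g in genes):
        raise ValueError("every gene needs at least one isoform")
    best, best_spread = None, None
    # Exhaustive search over all combinations; fine for small families only.
    for combo in product(*(sorted(isoforms[g].items()) for g in genes)):
        lengths = [length for _, length in combo]
        spread = max(lengths) - min(lengths)
        if best_spread is None or spread < best_spread:
            best_spread = spread
            best = {g: name for g, (name, _) in zip(genes, combo)}
    return best


if __name__ == "__main__":
    # Hypothetical gene family with made-up isoform names and lengths.
    family = {
        "human": {"h1": 500, "h2": 320},
        "mouse": {"m1": 310},
        "rat": {"r1": 330, "r2": 700},
    }
    print(select_similar_length(family))
```

In this toy family, choosing the longest isoform per gene would pair proteins of 500, 310 and 700 residues, while the length-similarity rule picks 320, 310 and 330, which leaves far less unmatched sequence to be forced into the alignment.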