ABSTRACT
Genomic effect variants associated with survival and protection against complex diseases vary between populations due to microevolutionary processes. The aim of this study was to analyse the diversity and distribution of effect variants in the context of potential positive selection. In total, 475 individuals of Lithuanian origin were genotyped using high-throughput scanning and/or sequencing technologies. Allele frequency analysis for the pre-selected effect variants was performed using the catalogue of single nucleotide polymorphisms. The pre-selected effect variants were compared with variants in primate species to ascertain which allele was derived and potentially protective. Recent positive selection analysis was performed to verify this protective effect. Four variants with significantly different frequencies compared to European populations were identified, while two other variants reached borderline significance. An effect variant in the SLC30A8 gene may potentially protect against type 2 diabetes. The existing paradox of high rates of type 2 diabetes in the Lithuanian population alongside relatively high frequencies of genome variants potentially protective against it indicates a lack of knowledge about the interactions between environmental factors, regulatory regions, and other genome variation. Identification of effect variants is a step towards better understanding of microevolutionary processes and etiopathogenetic mechanisms, and towards personalised medicine.
ABSTRACT
Background and Objectives: Heterozygous pathogenic variants in the MED13L gene cause impaired intellectual development and distinctive facial features with or without cardiac defects (MIM #616789). This complex neurodevelopmental disorder is characterised by various phenotypic features, including plagiocephaly, strabismus, clubfoot, poor speech, and developmental delay. The aim of this study was to evaluate the clinical significance and consequences of a novel heterozygous intragenic MED13L deletion in a proband with clinical features of a MED13L-related disorder through extensive clinical, molecular, and functional characterisation. Materials and Methods: Combined comparative genomic hybridisation and single-nucleotide polymorphism array (SNP-CGH) was used to identify the changes in the proband's gDNA sequence (DECIPHER #430183). The intragenic MED13L deletion was specified via quantitative polymerase chain reaction (qPCR) and Sanger sequencing of the proband's cDNA sample. Western blot and bioinformatics analyses were used to investigate the consequences of this copy number variant (CNV) at the protein level. CRISPR-Cas9 technology was used for a MED13L-gene-silencing experiment in a culture of the control individual's skin fibroblasts. After the MED13L-gene-editing experiment, functional analyses of the fibroblast culture were performed. Results: The analysis of the proband's cDNA sample allowed the breakpoint regions to be specified and identified a heterozygous deletion spanning exons 3 to 10 of MED13L, which has not been reported previously. In silico, the deletion was predicted to result in a truncated protein NP_056150.1:p.(Val104Glyfs*5), partly altering the Med13_N domain and losing the MedPIWI and Med13_C domains. After MED13L gene editing, reduced cell viability, an accelerated aging process, and inhibition of RB1, E2F1, and CCNC gene expression were observed.
Conclusions: Based on these findings, the heterozygous intragenic 12q24.21 deletion in the affected individual resulted in MED13L haploinsufficiency due to the premature termination of protein translation, therefore leading to MED13L haploinsufficiency syndrome.
Subjects
Haploinsufficiency , Intellectual Disability , Humans , Haploinsufficiency/genetics , Intellectual Disability/genetics , Phenotype , Complementary DNA , Syndrome , Mediator Complex/genetics
ABSTRACT
Background and Objectives: Pathogenic variants of PIGN are a known cause of multiple congenital anomalies-hypotonia-seizures syndrome 1 (MCAHS1). Many affected individuals have clinical features overlapping with Fryns syndrome and are mainly characterised by developmental delay, congenital anomalies, hypotonia, seizures, and specific minor facial anomalies. This study investigates the clinical and molecular data of three individuals from two unrelated families, the clinical features of which were consistent with a diagnosis of MCAHS1. Materials and Methods: Next-generation sequencing (NGS) technology was used to identify the changes in the DNA sequence. Sanger sequencing of gDNA of probands and their parents was used for validation and segregation analysis. Bioinformatics tools were used to investigate the consequences of pathogenic or likely pathogenic PIGN variants at the protein sequence and structure level. Results: The analysis of NGS data and segregation analysis revealed a compound heterozygous NM_176787.5:c.[1942G>T];[1247_1251del] PIGN genotype in family 1 and NG_033144.1(NM_176787.5):c.[932T>G];[1674+1G>C] PIGN genotype in family 2. In silico, c.1942G>T (p.(Glu648Ter)), c.1247_1251del (p.(Glu416GlyfsTer22)), and c.1674+1G>C (p.(Glu525AspfsTer68)) variants are predicted to result in a premature termination codon that leads to truncated and functionally disrupted protein causing the phenotype of MCAHS1 in the affected individuals. Conclusions: PIGN-related disease represents a wide spectrum of phenotypic features, making clinical diagnosis inaccurate and complicated. The genetic testing of every individual with this phenotype provides new insights into the origin and development of the disease.
Subjects
Congenital Limb Deformities , Muscle Hypotonia , Humans , Muscle Hypotonia/genetics , Muscle Hypotonia/pathology , Lithuania , Phosphotransferases/genetics , Seizures , Syndrome , Mutation , Pedigree
ABSTRACT
Background and Objectives: Pathogenic variants of SLC9A6 are a known cause of a rare, X-linked neurological disorder called Christianson syndrome (CS). The main characteristics of CS are developmental delay, intellectual disability, and neurological findings. This study investigated the genetic basis and explored the molecular changes that led to CS in two male siblings presenting with intellectual disability, epilepsy, behavioural problems, gastrointestinal dysfunction, and poor height and weight gain. Materials and Methods: Next-generation sequencing of a tetrad was applied to identify the DNA changes, and Sanger sequencing of the proband's cDNA was used to evaluate the impact of a splice site variant on mRNA structure. Bioinformatic tools were used to investigate SLC9A6 protein structure changes. Results: Sequencing and bioinformatic analysis revealed a novel donor splice site variant (NC_000023.11(NM_001042537.1):c.899+1G>A) that leads to a frameshift and a premature stop codon. Protein structure modelling showed that the truncated protein is unlikely to form any functionally relevant SLC9A6 dimers. Conclusions: Molecular and bioinformatic analysis revealed the impact of a novel donor splice site variant in the SLC9A6 gene that leads to a truncated and functionally disrupted protein, causing the phenotype of CS in the affected individuals.
Subjects
Epilepsy , Intellectual Disability , Microcephaly , Ataxia , Epilepsy/genetics , X-Linked Genetic Diseases , Humans , Intellectual Disability/genetics , Lithuania , Male , Microcephaly/genetics , Ocular Motility Disorders
ABSTRACT
BACKGROUND: Autosomal recessive limb-girdle muscular dystrophy-1 (LGMDR1), also known as calpainopathy, is a genetically heterogeneous disorder characterised by progressive muscle weakness. Homozygous or compound heterozygous variants in the CAPN3 gene are known genetic causes of this condition. The aim of this study was to confirm the molecular consequences of the CAPN3 variant NG_008660.1(NM_000070.3):c.1746-20C>G in an individual with suspected LGMDR1 by extensive complementary DNA (cDNA) analysis. CASE PRESENTATION: We report on a male with proximal muscular weakness in his lower limbs. A compound heterozygous NM_000070.3:c.598_612del and NG_008660.1(NM_000070.3):c.1746-20C>G genotype was detected in the CAPN3 gene by targeted next-generation sequencing (NGS). To confirm the pathogenicity of the c.1746-20C>G variant, we conducted genetic analysis based on Sanger sequencing of the proband's cDNA sample. The results revealed that this splicing variant disrupts the original 3' splice site in intron 13, thus leading to the skipping of the DNA fragment involving exon 14 and possibly exon 15. However, the lack of exon 15 in the CAPN3 isoforms present in a blood sample was explained by cell-specific alternative splicing rather than an aberrant splicing mechanism. In silico, the c.1746-20C>G splicing variant was predicted to result in a frameshift and the formation of a premature termination codon (NP_000061.1:p.(Glu582Aspfs*62)). CONCLUSIONS: Based on the results of our study and the literature we reviewed, both the c.598_612del and c.1746-20C>G variants are pathogenic and together cause LGMDR1. Therefore, extensive mRNA and/or cDNA analysis of splicing variants is critical to understanding the pathogenesis of the disease.
Subjects
Calpain , Limb-Girdle Muscular Dystrophies , Calpain/genetics , Homozygote , Humans , Male , Muscle Proteins/genetics , Limb-Girdle Muscular Dystrophies/genetics , Mutation
ABSTRACT
Biallelic pathogenic variants in the POMK gene are associated with two types of dystroglycanopathies: limb-girdle muscular dystrophy-dystroglycanopathy, type C12 (MDDGC12), and congenital muscular dystrophy-dystroglycanopathy with brain and eye anomalies, type A12 (MDDGA12). These disorders are very rare and have previously been reported in 10 affected individuals. We present two unrelated Lithuanian families with prenatally detected hydrocephalus due to a homozygous nonsense variant in POMK. The first signs of hydrocephalus in the affected fetuses became evident at 15 weeks of gestation and progressed rapidly; these clinical features are thus compatible with a diagnosis of MDDGA12. The association between pathogenic POMK variants and macrocephaly and severe hydrocephalus has previously been reported in only two families. The clinical and molecular findings presented in this report highlight congenital hydrocephalus as a distinct feature of POMK-related disorders and a differentiator from other dystroglycanopathies. These findings further extend the spectrum of MDDGA12 syndrome.
Subjects
Limb-Girdle Muscular Dystrophies/diagnosis , Limb-Girdle Muscular Dystrophies/genetics , Nervous System Malformations/diagnosis , Protein Kinases/genetics , Adult , Brain/diagnostic imaging , Brain/pathology , Nonsense Codon/genetics , Female , Homozygote , Humans , Newborn Infant , Male , Limb-Girdle Muscular Dystrophies/diagnostic imaging , Limb-Girdle Muscular Dystrophies/pathology , Mutation/genetics , Nervous System Malformations/diagnostic imaging , Nervous System Malformations/genetics , Nervous System Malformations/pathology , Pedigree , Pregnancy , Prenatal Ultrasonography
ABSTRACT
BACKGROUND: CHARGE syndrome (MIM# 214800) is characterised by a number of congenital anomalies, including coloboma, ear anomalies, deafness, facial anomalies, heart defects, choanal atresia, genital hypoplasia, growth retardation, and developmental delay, and is caused by a heterozygous variant in the CHD7 (MIM# 608892) gene located on chromosome 8q12. We report the identification of a novel c.5535-1G>A variant in CHD7 and provide an evaluation of its effect on pre-mRNA splicing. CASE PRESENTATION: We report on a female presenting with features of CHARGE syndrome. A novel heterozygous CHD7 variant, c.5535-1G>A, located in the acceptor splice site of intron 26, was identified in the proband's DNA sample after analysis of whole exome sequencing data. In silico predictions indicating that the variant is probably pathogenic by affecting pre-mRNA splicing were verified by genetic analysis based on reverse transcription of the patient's RNA, followed by PCR amplification of the synthesised cDNA and Sanger sequencing. Sanger sequencing of cDNA revealed that the c.5535-1G>A variant disrupts the original acceptor splice site and activates a cryptic splice site only one nucleotide downstream of the pathogenic variant site. This change causes the omission of the first nucleotide of exon 27, leading to a frameshift in the mRNA of the CHD7 gene. Our results suggest that the alteration induces premature truncation of the CHD7 protein (UniProtKB: Q9P2D1), thus resulting in CHARGE syndrome. CONCLUSION: Genetic analysis of a novel splice site variant is important both for studying the pathogenic splicing mechanism and for confirming a diagnosis.
Subjects
CHARGE Syndrome/genetics , DNA Helicases/genetics , DNA-Binding Proteins/genetics , Genetic Predisposition to Disease/genetics , RNA Splice Sites , Adolescent , Amino Acid Sequence , Base Sequence , CHARGE Syndrome/diagnostic imaging , CHARGE Syndrome/physiopathology , Female , Frameshift Mutation , Genetic Association Studies , Heterozygote , Humans , Introns , Mutation , RNA Splicing , Messenger RNA , Sequence Alignment , Temporal Bone/diagnostic imaging , Exome Sequencing
ABSTRACT
Next-generation sequencing (NGS) has become an effective approach for finding novel causative genomic variants of genetic disorders and is increasingly used for diagnostic purposes. Public variant databases that gather data on pathogenic variants are relied upon as a source for clinical diagnosis. However, research on pathogenic variants using public database data could be carried out not only in patients but also in healthy people. This could provide insights into the most common recessive disorders in populations. The aim of the study was to use NGS and data from the ClinVar database to identify pathogenic variants in the exomes of healthy individuals from the Lithuanian population. To achieve this, 96 exomes were sequenced. An average of 42 139 single-nucleotide variants (SNVs) and 2306 short INDELs were found in each individual exome. Pooled data from the study exomes provided a total of 243 192 unique SNVs and 31 623 unique short INDELs. Three hundred and twenty-one unique SNVs were classified as pathogenic. Comparison of the European data from the 1000 Genomes Project with our data revealed five pathogenic genomic variants that are inherited in an autosomal recessive pattern and whose frequencies differ statistically significantly from the European population data.
Subjects
High-Throughput Nucleotide Sequencing/methods , DNA Sequence Analysis/methods , Nucleic Acid Databases , Exome , Genetic Variation/genetics , Human Genome , Genomics , Humans , INDEL Mutation/genetics , Mutation , Single Nucleotide Polymorphism/genetics
ABSTRACT
BACKGROUND: Congenital hearing loss (CHL) is diagnosed in 1-2 per 1000 newborns, and genetic factors contribute to two thirds of CHL cases in industrialised countries. Mutations of the GJB2 gene located in the DFNB1 locus (13q11-12) are a major cause of CHL worldwide. The aim of this cross-sectional study was to assess the contribution of the DFNB1 locus, containing the GJB2 and GJB6 genes, to the development of early onset hearing loss (HL) in the affected group of participants, and to determine the population-specific mutational profile and DFNB1-related HL burden in the Lithuanian population. METHODS: Clinical data were obtained from a collection of 158 affected participants (146 unrelated probands) with early onset non-syndromic HL. GJB2 and GJB6 gene sequencing and GJB6 gene deletion testing were performed. GJB2 and GJB6 gene sequencing data from 98 participants in a group of self-reported healthy Lithuanian inhabitants were analysed. Summary statistics, homogeneity tests, and logistic regression analysis were used for the assessment of genotype-phenotype correlation. RESULTS: Two pathogenic GJB2 gene mutations were identified in 57.5% of affected participants. The most prevalent GJB2 mutations were c.35delG, p.(Gly12Valfs*2) (rs80338939) and c.313_326del14, p.(Lys105Glyfs*5) (rs111033253), with allele frequencies of 64.7% and 28.3%, respectively. GJB6 gene mutations were not identified in the affected group of participants. The statistical analysis revealed significant differences between the GJB2(-) and GJB2(+) groups in disease severity (p = 0.001) and family history (p = 0.01). The probability of identifying GJB2 mutations in patients with various HL characteristics was estimated. A carrier rate of GJB2 gene mutations of 7.1% (~1 in 14) was identified in the group of healthy participants, and a high frequency of GJB2-related hearing loss was estimated in our population.
DISCUSSION: The results show a very high proportion of GJB2-positive individuals in the research group affected with sensorineural HL. The allele frequency of the c.35delG mutation (64.7%) is consistent with many previously published studies in groups of affected individuals of Caucasian populations. The high frequency of the c.313_326del14 mutation (28.3% of pathogenic alleles) in the affected group of participants was an unexpected finding in our study, suggesting not only a high frequency of carriers of this mutation in our population but also its possible origin in Lithuanian ancestors. The high frequency of carriers of the c.313_326del14 mutation in the entire Lithuanian population is supported by its being identified twice in the ethnic Lithuanian group of healthy participants (a carrier frequency of 2.0% in the study group). CONCLUSION: Analysis of the allele frequency of GJB2 gene mutations revealed a high proportion of c.313_326del14 (rs111033253) mutations in the GJB2-positive group, suggesting its possible origin in Lithuanian forebears. The high frequency of carriers of GJB2 gene mutations in the group of healthy participants corresponds to the substantial frequency of GJB2-associated HL in Lithuania. The observations of the study indicate the significant contribution of GJB2 gene mutations to the pathogenesis of the disorder in the Lithuanian population and will contribute to introducing principles for predicting the characteristics of the disease in patients.
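The link between a carrier rate and the expected frequency of a recessive disorder can be illustrated with a minimal sketch. It assumes Hardy-Weinberg equilibrium and treats all pathogenic GJB2 alleles as a single allele class; the abstract does not state how its own estimate was derived, so this is an illustration of the general arithmetic rather than the authors' method. The function names are hypothetical.

```python
import math


def allele_freq_from_carrier_rate(carrier_rate: float) -> float:
    """Solve 2q(1 - q) = carrier_rate for the pathogenic allele frequency q.

    Assumes Hardy-Weinberg proportions (an assumption of this sketch).
    """
    if not 0.0 <= carrier_rate <= 0.5:
        raise ValueError("carrier rate must lie between 0 and 0.5")
    return (1.0 - math.sqrt(1.0 - 2.0 * carrier_rate)) / 2.0


def expected_affected_fraction(carrier_rate: float) -> float:
    """Expected fraction with two pathogenic alleles (q squared)."""
    q = allele_freq_from_carrier_rate(carrier_rate)
    return q * q


# 7.1% of 98 healthy participants corresponds to 7 carriers.
carrier_rate = 7 / 98
q = allele_freq_from_carrier_rate(carrier_rate)
affected = expected_affected_fraction(carrier_rate)
print(f"allele frequency q = {q:.4f}")
print(f"expected affected = about 1 in {1 / affected:.0f}")
```

With only 98 healthy participants, the confidence interval around such an estimate would be wide, so the output is best read as an order-of-magnitude figure.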
Subjects
Connexins/genetics , Sensorineural Hearing Loss/genetics , White People/genetics , Alleles , Preschool Child , Connexin 26 , Cross-Sectional Studies , Female , Gene Deletion , Gene Frequency , Genetic Association Studies , Genetic Loci , Sensorineural Hearing Loss/diagnosis , Humans , Lithuania , Logistic Models , Male , Mutation , DNA Sequence Analysis
ABSTRACT
A large number of genome variants have been associated with complex traits, mainly through genome-wide association studies (GWAS). Polygenic risk scores (PRSs) are a widely accepted method for calculating an individual's complex trait prognosis from such data. Compared with monogenic traits, the practical application of this method to complex traits still lags behind. Calculating PRSs from all GWAS data has limited practical usability for behaviour traits because of statistical noise and the small effect sizes of the many genome variants involved. From a behaviour traits perspective, complex traits are explored here using the concept of core genes from the omnigenic model, with the aim of employing a simplified version of the calculation. Simplification may reduce accuracy compared to a complete PRS encompassing all trait-associated variants. Integrating genome data with datasets from various disciplines, such as IT and psychology, could lead to better complex trait prediction. This review elucidates the significance of clear biological pathways in understanding behaviour traits. Specifically, it highlights the essential role of genes related to hormones, enzymes, and neurotransmitters as robust core genes in shaping these traits. Significant variations in core genes are prominently observed in behaviour traits such as stress response, impulsivity, and substance use.
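The contrast between a complete PRS and a simplified core-gene score can be sketched as follows. A PRS is commonly computed as a weighted sum of effect-allele dosages; the variant identifiers, effect weights, and the choice of "core" variants below are all hypothetical placeholders, not values from the review.

```python
from typing import Dict, Iterable, Optional


def polygenic_score(
    dosages: Dict[str, int],
    weights: Dict[str, float],
    variants: Optional[Iterable[str]] = None,
) -> float:
    """Weighted sum of effect-allele dosages (0, 1, or 2 copies per variant).

    If `variants` is given, only that subset contributes, which mimics a
    simplified score restricted to core-gene variants.
    """
    selected = weights.keys() if variants is None else variants
    # Variants missing from the genotype data are treated as dosage 0,
    # which is a simplification of this sketch.
    return sum(weights[v] * dosages.get(v, 0) for v in selected)


# Hypothetical per-variant effect weights (e.g. log odds ratios from a GWAS).
weights = {"rsA": 0.20, "rsB": -0.10, "rsC": 0.05}
# Hypothetical genotype of one individual, as effect-allele counts.
dosages = {"rsA": 2, "rsB": 1, "rsC": 0}

full_score = polygenic_score(dosages, weights)
core_score = polygenic_score(dosages, weights, variants=["rsA"])
print(full_score, core_score)
```

Restricting the sum to a few large-effect variants discards the small contributions of the remaining variants, which is the accuracy trade-off the review describes.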
Subjects
Genome-Wide Association Study , Genomics , Impulsive Behavior , Multifactorial Inheritance/genetics , Phenotype
ABSTRACT
Cybersecurity (CS) is a contemporary field for research and applied study of a range of aspects from across multiple disciplines. A cybersecurity expert has in-depth knowledge of technology but is often also recognized for the ability to view technology in a non-standard way. This paper explores how CS specialists combine professional computing-based skills with genetically encoded traits. Almost every human behavioral trait results from many genome variants acting together with environmental factors. The review focuses on contextualizing behavior genetics in the application of cybersecurity. It reconsiders methods that help to identify aspects of human behavior from genetic information. Stress is used as an illustrative factor to start a discussion within the community on what methodology should be used to approach those questions in an ethical way. CS positions are considered stressful due to the complexity of the domain and the social impact that failure can have. An individual risk profile could be created by combining known genome variants linked to a particular behavioral trait using a dedicated biostatistical approach such as a polygenic score. The reviewed advances bring challenging possibilities for the application of human behavior genetics to CS.
ABSTRACT
Hemizygosity of the MIR17HG gene encoding the miR-17~92 cluster is associated with Feingold syndrome 2, characterized by intellectual disability, skeletal abnormalities, short stature, and microcephaly. Here, we report on a female with a de novo 13q31.3 microduplication encompassing MIR17HG but excluding GPC5. She presented with developmental delay, skeletal and digital abnormalities, and features such as tall stature and macrocephaly that mirror those of Feingold syndrome 2 patients. The proband's rearrangement is limited to the miR cluster, and the expression level of the neighboring GPC5 in her cells is correspondingly normal. Previously described affected individuals from two families, who carry overlapping duplications of the miR-17~92 cluster that include part of GPC5, likewise presented with macrocephaly, developmental delay, and skeletal, digital, and stature abnormalities. Together, these data allow us to define a new syndrome due to independent microduplication of the miR-17~92 cluster.
Subjects
Chromosome Disorders/genetics , Eyelids/abnormalities , Intellectual Disability/genetics , Congenital Limb Deformities/genetics , MicroRNAs/genetics , Microcephaly/genetics , Tracheoesophageal Fistula/genetics , Adolescent , Chromosome Deletion , Human Chromosome Pair 13/genetics , Comparative Genomic Hybridization/methods , Developmental Disabilities/genetics , Dwarfism/genetics , Female , Gene Duplication/genetics , Glypicans/genetics , Glypicans/metabolism , Humans , Phenotype
ABSTRACT
BACKGROUND: Preaxial polydactyly type IV, also referred to as polysyndactyly, has been described in a few syndromes. We present three generations of a family with preaxial polydactyly type IV and other clinical features of Greig cephalopolysyndactyly syndrome (GCPS). METHODS AND RESULTS: Sequencing analysis of the GLI3 coding region identified a novel donor splice site variant, NC_000007.14(NM_000168.6):c.473+3A>T, in the proband, and the same pathogenic variant was subsequently identified in other affected family members. Functional analysis based on Sanger sequencing of the proband's complementary DNA (cDNA) sample revealed that the splice site variant c.473+3A>T disrupts the original donor splice site, thus leading to exon 4 skipping. Based on further in silico analysis, this pathogenic splice site variant results in a truncated protein, NP_000159.3:p.(His123Argfs*57), which lacks almost all functionally important domains. Therefore, functional cDNA analysis confirmed that haploinsufficiency of GLI3 is the cause of GCPS in the affected family members. CONCLUSION: Despite the evidence provided, pathogenic variants in GLI3 do not always correlate definitively with the syndromic or nonsyndromic clinical phenotypes associated with this gene. For this reason, further transcriptomic and proteomic evaluation could be suggested.
Subjects
Acrocephalosyndactylia/genetics , Genetic Predisposition to Disease/genetics , Nerve Tissue Proteins/genetics , Zinc Finger Protein Gli3/genetics , Acrocephalosyndactylia/diagnostic imaging , Acrocephalosyndactylia/physiopathology , Child , Complementary DNA , Female , Humans , Middle Aged , Mutation , Nerve Tissue Proteins/metabolism , Pedigree , Phenotype , Proteomics , DNA Sequence Analysis , Transcriptome , Zinc Finger Protein Gli3/metabolism
ABSTRACT
BACKGROUND: Alcohol use disorder (AUD) is a chronic relapsing brain disease characterized by compulsive alcohol use, loss of control over alcohol intake, and a negative emotional state when not using (1). Abusive alcohol consumption directly affects a person's physical and psychological health and social life. The World Health Organization has shown that Lithuania is a leading country in pure alcohol consumption in the world (2). The aim of this study was to find novel genome variants associated with AUD in a Lithuanian cohort. MATERIALS AND METHODS: A case-control study included 294 individuals of Lithuanian ethnicity, who were divided into two groups based on their alcohol use habits. Single nucleotide polymorphism array analysis was performed using the Illumina HiScanSQ™ genome analyzer. RESULTS: Our study showed that the rs686141T>C variant in the NALCN gene is more prevalent in the non-drinker group than in the alcohol drinker group (relative allele frequency 0.38 and 0.27, respectively; OR = 0.60 (95% CI 0.37-0.98), p = 0.0408). Meanwhile, the genotype distribution of the rs6354C>A variant in the SLC6A4 gene differed statistically significantly between the non-drinker and alcohol drinker groups (distribution of genotypes in the case group: 9/72/172 (CC/CA/AA) and in the control group: 5/7/29, p = 0.0264). CONCLUSION: We analyzed 23 genes associated with AUD and identified two novel genome variants (rs686141T>C and rs6354C>A). The study shows that genome analysis is an important tool for AUD research. The results supplement the known information about genes associated with AUD.
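The reported odds ratio for rs686141T>C can be checked against the two allele frequencies quoted in the abstract. The sketch below computes an allelic odds ratio from frequencies alone. It assumes the drinker group is the numerator group, and it cannot reproduce the confidence interval or p-value, because the abstract does not give allele counts. The function name is hypothetical.

```python
def allelic_odds_ratio(freq_group: float, freq_reference: float) -> float:
    """Odds of carrying the allele in one group relative to a reference group."""
    for freq in (freq_group, freq_reference):
        if not 0.0 < freq < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
    odds_group = freq_group / (1.0 - freq_group)
    odds_reference = freq_reference / (1.0 - freq_reference)
    return odds_group / odds_reference


# Allele frequencies quoted in the abstract for rs686141T>C.
drinker_freq = 0.27
non_drinker_freq = 0.38

odds_ratio = allelic_odds_ratio(drinker_freq, non_drinker_freq)
print(f"OR = {odds_ratio:.2f}")
```

The result agrees with the reported OR of 0.60 to two decimal places, which suggests the published value compares drinkers against non-drinkers.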
ABSTRACT
BACKGROUND: Intellectual disability affects about 1-2% of the general population worldwide and is a leading socio-economic problem in health care. The evaluation of the genetic causes of intellectual disability is challenging because these conditions are genetically heterogeneous, with many different genetic alterations resulting in clinically indistinguishable phenotypes. Genome-wide molecular technologies are effective in a research setting for establishing new genetic bases of disease. We describe the first Lithuanian experience in genome-wide CNV detection and whole exome sequencing, presenting the results obtained in the research project UNIGENE. MATERIALS AND METHODS: Patients with developmental delay/intellectual disability were investigated (n = 66). Diagnostic screening was performed using array-CGH technology. FISH and real-time PCR were used to confirm gene-dose imbalances and to investigate parental samples. Whole exome sequencing using a high-throughput next-generation sequencing (NGS) technique was used to sequence the samples of 12 selected families. RESULTS: Fourteen of 66 patients had pathogenic copy number variants, and one patient had a novel likely pathogenic aberration (microdeletion at 4p15.2). Twelve families were processed for whole exome sequencing. Two identified sequence variants could be classified as pathogenic (in the MECP2 and CREBBP genes). The other families had several candidate intellectual disability gene variants that are of unclear clinical significance and must be further investigated for a possible effect on the molecular pathways of intellectual disability. CONCLUSIONS: The genetic heterogeneity of intellectual disability requires genome-wide approaches, including detection of chromosomal aberrations by chromosomal microarrays and whole exome sequencing capable of uncovering single-gene mutations.
This study demonstrates the benefits and challenges that accompany the use of genome wide molecular technologies and provides genotype-phenotype information on 32 patients with chromosomal imbalances and ID candidate sequence variants.
ABSTRACT
BACKGROUND: Every next-generation sequencing (NGS) platform relies on proprietary and open-source computational tools to analyze sequencing data. NGS tools for Illumina platforms are well documented, which is not the case for AB SOLiD systems. We applied several computational and variant calling pipelines to analyse targeted exome sequencing data obtained using the AB SOLiD 5500 system. The investigated tools comprised the proprietary LifeScope pipeline in combination with open-source color-space-competent mapping programs and a variant caller. We present instrumental details of the pipelines that were used and a quantitative comparative analysis of the variant lists generated by the LifeScope pipeline versus open-source tools. RESULTS: Sufficient coverage of targeted regions was achieved by all investigated pipelines. High variability was observed in the identities of variants across the mapping programs. We observed less than 50% concordance between variant lists produced by approaches based on different mapping algorithms. We summarized the different approaches with regard to the coverage (DP) and quality (QUAL) properties of the variants provided by GATK and found that the LifeScope computational pipeline is superior. Fusing information on mapping profiles (pileup) at the genomic positions of variants across several different alignments proved to be a useful strategy for assessing questionable singleton variants. CONCLUSIONS: We quantitatively supported the conclusion that the LifeScope pipeline is superior for processing sequencing data obtained with the AB SOLiD 5500 system. Nevertheless, the use of alternative pipelines is encouraged because aggregating information from other mapping and variant calling approaches helps to resolve questionable calls and increases confidence in a call. It was noted that the coverage threshold for a variant to be considered for further analysis has to be chosen in a data-driven way to prevent the loss of important information.
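The idea of measuring concordance between variant lists and flagging singleton calls can be sketched as set operations. The abstract does not define its concordance metric, so the Jaccard index used here is an assumption, and the pipeline names and variant records are hypothetical placeholders.

```python
from collections import Counter
from typing import Dict, Set, Tuple

# A variant is identified by (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def concordance(calls_a: Set[Variant], calls_b: Set[Variant]) -> float:
    """Jaccard index: shared variants divided by all variants called by either pipeline."""
    union = calls_a | calls_b
    if not union:
        return 1.0  # two empty call sets agree trivially
    return len(calls_a & calls_b) / len(union)


def singleton_calls(call_sets: Dict[str, Set[Variant]]) -> Set[Variant]:
    """Variants reported by exactly one pipeline, i.e. candidates for closer review."""
    counts = Counter(v for calls in call_sets.values() for v in calls)
    return {variant for variant, n in counts.items() if n == 1}


# Hypothetical call sets from two pipelines.
call_sets = {
    "pipeline_a": {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 50, "G", "A")},
    "pipeline_b": {("chr1", 100, "A", "G"), ("chr2", 50, "G", "A"), ("chr3", 10, "T", "C")},
}

print(concordance(call_sets["pipeline_a"], call_sets["pipeline_b"]))
print(sorted(singleton_calls(call_sets)))
```

In practice, the variant records would be parsed from each pipeline's VCF output, and singleton calls would then be inspected against the pileup at that position in each alignment.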