Results 1 - 20 of 14,599
1.
BMC Genom Data ; 25(1): 44, 2024 May 07.
Article in English | MEDLINE | ID: mdl-38714950

ABSTRACT

BACKGROUND: China has thousands of years of goat breeding history and abundant goat genetic resources. The Hainan black goat is one of the high-quality local goat breeds in China. To conserve the germplasm resources of the Hainan black goat, facilitate its genetic improvement, and further protect the genetic diversity of goats, it is urgent to develop a single nucleotide polymorphism (SNP) chip for the Hainan black goat. RESULTS: In this study, we aimed to design a 10K liquid chip for the Hainan black goat based on genotyping by pinpoint sequencing of liquid captured targets (cGPS). A total of 45,588 candidate SNP sites were obtained, of which 10,677 representative SNP sites were selected to design probes; these finally covered 9,993 intervals and formed a 10K cGPS liquid chip for the Hainan black goat. To verify the 10K cGPS liquid chip, some southern Chinese goat breeds and a sheep breed with a phenotype similar to that of the Hainan black goat were selected. A total of 104 samples were used to verify the clustering ability of the chip. The results showed that the detection rate of sites was 97.34%-99.93%, and 84.5% of SNP sites were polymorphic. The heterozygosity rate was 3.08%-36.80%. The depth of more than 99.4% of sites was above 10X. The repetition rate was 99.66%-99.82%. The average consistency between cGPS liquid chip results and resequencing results was 85.58%. In addition, the phylogenetic tree clustering analysis verified that the SNP sites on the chip had better clustering ability. CONCLUSION: These results indicate that we have successfully developed and verified the 10K cGPS liquid chip for the Hainan black goat, which provides a useful tool for genome analysis of this breed. Moreover, the 10K cGPS liquid chip is conducive to the research and protection of Hainan black goat germplasm resources and lays a solid foundation for subsequent breeding work.


Subject(s)
Goats , Oligonucleotide Array Sequence Analysis , Polymorphism, Single Nucleotide , Animals , Goats/genetics , Polymorphism, Single Nucleotide/genetics , Oligonucleotide Array Sequence Analysis/methods , China , Genotyping Techniques/methods , Genotype , Sequence Analysis, DNA/methods , Breeding/methods
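
The chip quality metrics quoted in the abstract above (site detection rate, heterozygosity rate, share of polymorphic sites) can be illustrated with a minimal Python sketch. The genotype coding (0, 1, 2, None) and the function names are assumptions made for illustration; the abstract does not say how the authors computed these values.

```python
from typing import Optional, Sequence

# Assumed genotype coding (not taken from the paper):
# 0 = homozygous reference, 1 = heterozygous, 2 = homozygous alternate,
# None = no call at this site.
Genotype = Optional[int]


def detection_rate(calls: Sequence[Genotype]) -> float:
    """Fraction of sites in one sample that received a genotype call."""
    if not calls:
        raise ValueError("no sites")
    return sum(g is not None for g in calls) / len(calls)


def heterozygosity_rate(calls: Sequence[Genotype]) -> float:
    """Fraction of called sites in one sample that are heterozygous."""
    called = [g for g in calls if g is not None]
    if not called:
        raise ValueError("no called sites")
    return sum(g == 1 for g in called) / len(called)


def is_polymorphic(site_calls: Sequence[Genotype]) -> bool:
    """True if both alleles are observed at one site across samples."""
    called = [g for g in site_calls if g is not None]
    has_ref = any(g in (0, 1) for g in called)
    has_alt = any(g in (1, 2) for g in called)
    return has_ref and has_alt
```

For a hypothetical sample `[0, 1, 2, None, 1]`, four of five sites are called and two of the four called sites are heterozygous.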
2.
BMC Plant Biol ; 24(1): 306, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38644480

ABSTRACT

Linkage maps are essential for genetic mapping of phenotypic traits, map-based gene cloning, and marker-assisted selection in breeding applications. Construction of a high-quality saturated map requires high-quality genotypic data on a large number of molecular markers. Errors in genotyping cannot be completely avoided, no matter what platform is used. When the genotyping error rate reaches a threshold level, it seriously affects the accuracy of the constructed map and the reliability of subsequent genetic studies. In this study, repeated genotyping of two recombinant inbred line (RIL) populations derived from the crosses Yangxiaomai × Zhongyou 9507 and Jingshuang 16 × Bainong 64 was used to investigate the effect of genotyping errors on linkage map construction. Inconsistent data points between the two replications were regarded as genotyping errors, which were classified into three types. Genotyping errors were treated as missing values, thereby generating the non-erroneous data set. Firstly, linkage maps were constructed using the two replicates as well as the non-erroneous data set. Secondly, error correction methods implemented in the software packages QTL IciMapping (EC) and Genotype-Corrector (GC) were applied to the two replicates. Linkage maps were then constructed from the corrected genotypes and compared with those from the non-erroneous data set. A simulation study considering different levels of genotyping error was performed to investigate the impact of errors and the accuracy of the error correction methods. Results indicated that map length and marker order differed among the two replicates and the non-erroneous data sets in both RIL populations. For both actual and simulated populations, map length expanded as the error rate increased, and the correlation coefficient between linkage and physical maps became lower. Map quality can be improved by repeated genotyping and error correction algorithms. 
When it is impossible to genotype the whole mapping population repeatedly, repeated genotyping of 30% of the population is recommended. The EC method had a much lower false positive rate than the GC method under different error rates. This study systematically expounded the impact of genotyping errors on linkage analysis, providing potential guidelines for improving the accuracy of linkage maps in the presence of genotyping errors.


Subject(s)
Chromosome Mapping , Genotype , Triticum , Triticum/genetics , Chromosome Mapping/methods , Quantitative Trait Loci , Genetic Linkage , Genotyping Techniques/methods , Oligonucleotide Array Sequence Analysis/methods
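
The replicate-based step described in the abstract above, where calls that disagree between two genotyping runs are treated as errors and set to missing, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name, the string coding of genotypes, and the choice to keep a call that is present in only one replicate are all assumptions.

```python
from typing import List, Optional, Sequence, Tuple

# Assumed coding: a genotype is a short string such as "A" or "B";
# None marks a missing call.
Call = Optional[str]


def mask_discordant(
    rep1: Sequence[Call], rep2: Sequence[Call]
) -> Tuple[List[Call], float]:
    """Build a consensus for one line across markers.

    Calls that differ between the replicates are set to None (missing).
    Returns the consensus and the discordance rate among markers
    that were called in both replicates.
    """
    if len(rep1) != len(rep2):
        raise ValueError("replicates must cover the same markers")
    consensus: List[Call] = []
    comparable = 0
    discordant = 0
    for a, b in zip(rep1, rep2):
        if a is None or b is None:
            # Assumption: keep whichever call exists, if any.
            consensus.append(a if a is not None else b)
            continue
        comparable += 1
        if a == b:
            consensus.append(a)
        else:
            discordant += 1
            consensus.append(None)
    rate = discordant / comparable if comparable else 0.0
    return consensus, rate
```

The discordance rate returned here is only a rough stand-in for the genotyping error rate discussed in the abstract, since two replicates can also agree on a wrong call.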
3.
Epigenetics ; 19(1): 2333660, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38564759

ABSTRACT

DNA methylation (DNAm) plays a crucial role in a number of complex diseases. However, the reliability of DNAm levels measured using Illumina arrays varies across different probes. Previous research primarily assessed probe reliability by comparing duplicate samples between the 450k-450k or 450k-EPIC platforms, with limited investigations on Illumina EPIC v1.0 arrays. We conducted a comprehensive assessment of the EPIC v1.0 array probe reliability using 69 blood DNA samples, each measured twice, generated by the Alzheimer's Disease Neuroimaging Initiative study. We observed higher reliability in probes with average methylation beta values of 0.2 to 0.8, and lower reliability in type I probes or those within promoter and CpG island regions. Importantly, we found that probe reliability has significant implications for the analysis of epigenome-wide association studies (EWAS). Higher reliability is associated with more consistent effect sizes across studies, the identification of differentially methylated regions (DMRs) and methylation quantitative trait loci (mQTLs), and significant correlations with downstream gene expression. Moreover, blood DNAm measurements obtained from probes with higher reliability are more likely to show concordance with brain DNAm measurements. Our findings, which provide crucial reliability information for probes on the EPIC v1.0 array, will serve as a valuable resource for future DNAm studies.


Subject(s)
DNA Methylation , Quantitative Trait Loci , Oligonucleotide Array Sequence Analysis/methods , Reproducibility of Results , CpG Islands
4.
Genes (Basel) ; 15(4)2024 Mar 22.
Article in English | MEDLINE | ID: mdl-38674328

ABSTRACT

Autoimmunity is defined as the inability to regulate immunological activities in the body, especially in response to external triggers, leading to the attack of the tissues and organs of the host. Outcomes include the onset of autoimmune diseases whose effects are primarily due to dysregulated immune responses. In past years, there have been cases that show an increased susceptibility to other autoimmune disorders in patients who are already experiencing the same type of disease. Research in this field has started analyzing the potential molecular and cellular causes of this interconnectedness, bearing in mind the possibility of advancing drugs and therapies for the treatment of autoimmunity. With that, this study aimed to determine the correlation of four autoimmune diseases, which are type 1 diabetes (T1D), psoriasis (PSR), systemic sclerosis (SSc), and systemic lupus erythematosus (SLE), by identifying highly preserved co-expressed genes among datasets using WGCNA. Functional annotation was then employed to characterize these sets of genes based on their systemic relationship as a whole to elucidate the biological processes, cellular components, and molecular functions of the pathways they are involved in. Lastly, drug repurposing analysis was performed to screen candidate drugs for repositioning that could regulate the abnormal expression of genes among the diseases. A total of thirteen modules were obtained from the analysis, the majority of which were associated with transcriptional, post-transcriptional, and post-translational modification processes. Also, the evaluation based on KEGG suggested the possible role of TH17 differentiation in the simultaneous onset of the four diseases. Furthermore, clomiphene was the top drug candidate for regulating overexpressed hub genes; meanwhile, prilocaine was the top drug for regulating under-expressed hub genes. 
This study was geared towards utilizing transcriptomics approaches for the assessment of microarray data, which is different from the use of traditional genomic analyses. Such a research design for investigating correlations among autoimmune diseases may be the first of its kind.


Subject(s)
Signal Transduction , Humans , Signal Transduction/genetics , Autoimmune Diseases/genetics , Autoimmune Diseases/drug therapy , Autoimmune Diseases/immunology , Diabetes Mellitus, Type 1/genetics , Diabetes Mellitus, Type 1/immunology , Oligonucleotide Array Sequence Analysis/methods , Gene Regulatory Networks , Immune System/metabolism , Scleroderma, Systemic/genetics , Scleroderma, Systemic/drug therapy , Scleroderma, Systemic/immunology , Lupus Erythematosus, Systemic/genetics , Lupus Erythematosus, Systemic/drug therapy , Lupus Erythematosus, Systemic/immunology , Psoriasis/genetics , Psoriasis/drug therapy , Psoriasis/immunology , Gene Expression Profiling/methods
5.
Biosens Bioelectron ; 253: 116172, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38460210

ABSTRACT

Simultaneous multiplexed analysis can provide comprehensive information for disease diagnosis. However, current multiplex methods rely on sophisticated barcode technology, which hinders their wider application. In this study, an ultrasimple size encoding method is proposed for multiplex detection using a wedge-shaped microfluidic chip. Driven by negative pressure, microparticles are naturally arranged in distinct stripes within the chip according to their sizes. This size encoding method is highly precise, distinguishing 3-5 sizes of microparticles with an accuracy of up to 99%, even for microparticles with a size difference as small as 0.5 µm. The entire size encoding process is completed in less than 5 min, making it ultrasimple, reliable, and easy to operate. To evaluate the function of this size encoding microfluidic chip, the nucleic acid sequences of three commonly co-infecting viruses (complementary DNA sequences of HIV and HCV, and a DNA sequence of HBV) are employed for multiplex detection. Results indicate that all three DNA sequences can be sensitively detected without any cross-interference. This size-encoding microfluidic chip-based multiplex detection method is simple, rapid, and high-resolution, and its successful application in serum samples makes it highly promising for clinical use.


Subject(s)
Biosensing Techniques , Microfluidic Analytical Techniques , Microfluidics , Base Sequence , Microfluidic Analytical Techniques/methods , Oligonucleotide Array Sequence Analysis/methods
6.
Biomed Microdevices ; 26(2): 20, 2024 Mar 02.
Article in English | MEDLINE | ID: mdl-38430318

ABSTRACT

Polymerase chain reaction (PCR) is considered the gold standard for detecting nucleic acids. A simple PCR system is of great significance for medical applications in remote areas, especially in developing countries. Herein, we propose a low-cost self-assembled platform for microchamber PCR. The working principle is to rotate the chamber PCR microfluidic chip between two heaters held at fixed temperatures, which solves the problem of a low temperature variation rate. The system consists of two temperature controllers, a screw slide rail, a chamber array microfluidic chip, and self-built software. Such a system can be constructed at a cost of about US$60. The microchamber PCR is completed by rotating the microfluidic chip between the two fixed-temperature heaters. Results demonstrated that the sensitivity of the temperature controller is 0.1 ℃. The relative error of the duration for the microfluidic chip was 0.02 s. Finally, we successfully amplified the target gene of Porphyromonas gingivalis in the chamber PCR microfluidic chip within 35 min and detected its PCR products on-site by fluorescence. The chip consisted of 3200 cylindrical chambers. The volume of reagent in each chamber is as low as 0.628 nL. This work provides an effective method to reduce the amplification time required for microchamber PCR.


Subject(s)
Microfluidics , Microfluidics/methods , Temperature , Oligonucleotide Array Sequence Analysis/methods , Polymerase Chain Reaction/methods
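
The chamber figures in the abstract above (3200 cylindrical chambers, 0.628 nL each) lend themselves to a short arithmetic check. The sketch below computes the total reagent volume and shows the cylinder-volume formula. The chamber radius and depth used in the example call are hypothetical values chosen only because they reproduce 0.628 nL; the abstract does not report the chamber dimensions.

```python
import math

# Values stated in the abstract.
N_CHAMBERS = 3200
VOLUME_PER_CHAMBER_NL = 0.628

UM3_PER_NL = 1e6  # 1 nL = 1e-12 m^3 = 1e6 cubic micrometres


def cylinder_volume_nl(radius_um: float, height_um: float) -> float:
    """Volume of a cylinder in nanolitres, given dimensions in micrometres."""
    return math.pi * radius_um ** 2 * height_um / UM3_PER_NL


# Total reagent held by the chip, in microlitres (1 uL = 1000 nL).
total_volume_ul = N_CHAMBERS * VOLUME_PER_CHAMBER_NL / 1000

# Hypothetical geometry (NOT from the paper): radius 100 um, depth 20 um.
example_volume_nl = cylinder_volume_nl(radius_um=100, height_um=20)
```

With these numbers the whole chip holds roughly 2 µL of reagent, which is consistent with the low per-chamber volume highlighted in the abstract.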
7.
Nat Commun ; 15(1): 1366, 2024 Feb 14.
Article in English | MEDLINE | ID: mdl-38355558

ABSTRACT

Efficient pathogen enrichment and nucleic acid isolation are critical for accurate and sensitive diagnosis of infectious diseases, especially those with low pathogen levels. Our study introduces a biporous silica nanofilm-embedded sample preparation chip for pathogen and nucleic acid enrichment/isolation. This chip features unique biporous nanostructures comprising large and small pore layers. Computational simulations confirm that these nanostructures enhance the surface area and promote the formation of nanovortices, resulting in improved capture efficiency. Notably, the chip demonstrates a 100-fold lower limit of detection compared to conventional methods used for nucleic acid detection. Clinical validations using patient samples corroborate the superior sensitivity of the chip when combined with the luminescence resonance energy transfer assay. The enhanced sample preparation efficiency of the chip, along with the facile and straightforward synthesis of the biporous nanostructures, offers a promising solution for polymerase chain reaction-free detection of nucleic acids.


Subject(s)
Nanostructures , Nucleic Acids , Humans , Microfluidics , Silicon Dioxide , Oligonucleotide Array Sequence Analysis/methods , Nucleic Acid Amplification Techniques
8.
Comput Biol Med ; 170: 108089, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38330824

ABSTRACT

Gene selection is the process of selecting discriminative genes from microarray data, which helps to diagnose and classify cancer samples effectively. Swarm intelligence evolution-based gene selection algorithms can never circumvent the problem that the population is prone to local optima in the process of gene selection. To tackle this challenge, previous research has focused primarily on two aspects: mitigating premature convergence to local optima and escaping from local optima. In contrast to these strategies, this paper introduces a novel perspective by adopting reverse thinking, where the issue of local optima is seen as an opportunity rather than an obstacle. Building on this foundation, we propose MOMOGS-PCE, a novel gene selection approach that effectively exploits the advantageous characteristics of populations trapped in local optima to uncover global optimal solutions. Specifically, MOMOGS-PCE employs a novel population initialization strategy, which initializes multiple populations that explore diverse orientations to foster distinct population characteristics. Next, an enhanced NSGA-II algorithm is used to amplify the advantageous characteristics exhibited by the population. Finally, a novel exchange strategy is proposed to facilitate the transfer of characteristics between populations that have reached near maturity in evolution, thereby promoting further population evolution and enhancing the search for better gene subsets. The experimental results demonstrated that MOMOGS-PCE exhibited significant advantages in comprehensive indicators compared with six competitive multi-objective gene selection algorithms. It is confirmed that the "reverse-thinking" approach not only avoids local optima but also leverages them to uncover superior gene subsets for cancer diagnosis.


Subject(s)
Algorithms , Neoplasms , Humans , Neoplasms/diagnosis , Neoplasms/genetics , Oligonucleotide Array Sequence Analysis/methods
9.
Nucleic Acids Res ; 52(7): e38, 2024 Apr 24.
Article in English | MEDLINE | ID: mdl-38407446

ABSTRACT

The Infinium BeadChip is the most widely used DNA methylome assay technology for population-scale epigenome profiling. However, the standard workflow requires over 200 ng of input DNA, hindering its application to small cell-number samples, such as primordial germ cells. We developed experimental and analysis workflows to extend this technology to suboptimal input DNA conditions, including ultra-low input down to single cells. DNA preamplification significantly enhanced detection rates to over 50% in five-cell samples and ∼25% in single cells. Enzymatic conversion also substantially improved data quality. Computationally, we developed a method to model the background signal's influence on the DNA methylation level readings. The modified detection P-value calculation achieved higher sensitivities for low-input datasets and was validated in over 100 000 public diverse methylome profiles. We employed the optimized workflow to query the demethylation dynamics in mouse primordial germ cells available at low cell numbers. Our data revealed nuanced chromatin states, sex disparities, and the role of DNA methylation in transposable element regulation during germ cell development. Collectively, we present comprehensive experimental and computational solutions to extend this widely used methylation assay technology to applications with limited DNA.


Subject(s)
DNA Methylation , Single-Cell Analysis , Animals , Female , Humans , Male , Mice , CpG Islands , DNA/genetics , DNA/metabolism , Epigenomics/methods , Germ Cells/metabolism , Oligonucleotide Array Sequence Analysis/methods , Single-Cell Analysis/methods
10.
Comput Biol Chem ; 109: 108009, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38219419

ABSTRACT

Many soft biclustering algorithms have been developed and applied to various biological and biomedical data analyses. However, few mutually exclusive (hard) biclustering algorithms have been proposed, which could better identify disease or molecular subtypes with survival significance based on genomic or transcriptomic data. In this study, we developed a novel mutually exclusive spectral biclustering (MESBC) algorithm based on spectral method to detect mutually exclusive biclusters. MESBC simultaneously detects relevant features (genes) and corresponding conditions (patients) subgroups and, therefore, automatically uses the signature features for each subtype to perform the clustering. Extensive simulations revealed that MESBC provided superior accuracy in detecting pre-specified biclusters compared with the non-negative matrix factorization (NMF) and Dhillon's algorithm, particularly in very noisy data. Further analysis of the algorithm on real datasets obtained from the TCGA database showed that MESBC provided more accurate (i.e., smaller p-value) overall survival prediction in patients with lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) cancers when compared to the existing, gold-standard subtypes for lung cancers (integrative clustering). Furthermore, MESBC detected several genes with significant prognostic value in both LUAD and LUSC patients. External validation on an independent, unseen GEO dataset of LUAD showed that MESBC-derived clusters based on TCGA data still exhibited clear biclustering patterns and consistent, outstanding prognostic predictability, demonstrating robust generalizability of MESBC. Therefore, MESBC could potentially be used as a risk stratification tool to optimize the treatment for the patient, improve the selection of patients for clinical trials, and contribute to the development of novel therapeutic agents.


Subject(s)
Adenocarcinoma of Lung , Carcinoma, Non-Small-Cell Lung , Carcinoma, Squamous Cell , Lung Neoplasms , Humans , Oligonucleotide Array Sequence Analysis/methods , Gene Expression Profiling/methods , Algorithms , Lung Neoplasms/genetics
11.
J Comput Biol ; 31(1): 71-82, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38010511

ABSTRACT

The analysis of gene expression data has made significant contributions to understanding disease mechanisms and developing new drugs and therapies. In such analysis, gene selection is often required for identifying informative and relevant genes and removing redundant and irrelevant ones. However, this is not an easy task, as gene expression data have inherent challenges such as ultra-high dimensionality, biological noise, and measurement errors. This study focuses on measurement errors in gene selection problems. Typically, high-throughput experiments have their own intrinsic measurement errors, which can result in an increase in falsely discovered genes. To alleviate this problem, this study proposes a gene selection method that takes measurement errors into account using generalized linear measurement error models. The method iterates filtering and selection steps until convergence, leading to fewer false positives and providing stable results under measurement errors. The performance of the proposed method is demonstrated through simulation studies and an application to a lung cancer data set.


Subject(s)
Gene Expression Profiling , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Computer Simulation
12.
Commun Biol ; 6(1): 1151, 2023 Nov 13.
Article in English | MEDLINE | ID: mdl-37953348

ABSTRACT

The function of regulatory elements is highly dependent on the cellular context, and thus for understanding the function of elements associated with psychiatric diseases these would ideally be studied in neurons in a living brain. Massively Parallel Reporter Assays (MPRAs) are molecular genetic tools that enable functional screening of hundreds of predefined sequences in a single experiment. These assays have not yet been adapted to query specific cell types in vivo in a complex tissue like the mouse brain. Here, using a test-case 3'UTR MPRA library with genomic elements containing variants from autism patients, we developed a method to achieve reproducible measurements of element effects in vivo in a cell type-specific manner, using excitatory cortical neurons and striatal medium spiny neurons as test cases. This targeted technique should enable robust, functional annotation of genetic elements in the cellular contexts most relevant to psychiatric disease.


Subject(s)
Oligonucleotide Array Sequence Analysis , Regulatory Sequences, Nucleic Acid , Animals , Humans , Mice , Oligonucleotide Array Sequence Analysis/methods , 3' Untranslated Regions , Cerebral Cortex , Medium Spiny Neurons
13.
BMC Bioinformatics ; 24(1): 408, 2023 Oct 30.
Article in English | MEDLINE | ID: mdl-37904108

ABSTRACT

BACKGROUND: Gene-wise differential expression is usually the first major step in the statistical analysis of high-throughput data obtained from techniques such as microarrays or RNA-sequencing. The analysis at gene level is often complemented by interrogating the data in a broader biological context that considers as unit of measure groups of genes that may have a common function or biological trait. Among the vast number of publications about gene set analysis (GSA), the rotation test for gene set analysis, also referred to as roast, is a general sample randomization approach that maintains the integrity of the intra-gene set correlation structure in defining the null distribution of the test. RESULTS: We present roastgsa, an R package that contains several enrichment score functions that feed the roast algorithm for hypothesis testing. These implemented methods are evaluated using both simulated and benchmarking data in microarray and RNA-seq datasets. We find that computationally intensive measures based on Kolmogorov-Smirnov (KS) statistics fail to improve the rates of simpler measures of GSA like mean and maxmean scores. We also show the importance of accounting for the gene linear dependence structure of the testing set, which is linked to the loss of effective signature size. Complete graphical representation of the results, including an approximation for the effective signature size, can be obtained as part of the roastgsa output. CONCLUSIONS: We encourage the usage of the absmean (non-directional), mean (directional) and maxmean (directional) scores for roast GSA analysis as these are simple measures of enrichment that have presented dominant results in all provided analyses in comparison to the more complex KS measures.


Subject(s)
Algorithms , Gene Expression Profiling , Gene Expression Profiling/methods , Rotation , Oligonucleotide Array Sequence Analysis/methods , Phenotype
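
The three simple enrichment scores recommended in the abstract above (absmean, mean, and maxmean) can be sketched in a few lines of Python. This is an illustration only: it follows the commonly used maxmean definition (the larger of the averaged positive parts and averaged negative parts of the gene-level statistics, returned here with its sign), and the exact definitions inside the roastgsa R package may differ. The rotation step that builds the null distribution is not shown.

```python
from typing import Sequence


def mean_score(stats: Sequence[float]) -> float:
    """Directional score: plain average of the gene-level statistics."""
    return sum(stats) / len(stats)


def absmean_score(stats: Sequence[float]) -> float:
    """Non-directional score: average of absolute gene-level statistics."""
    return sum(abs(s) for s in stats) / len(stats)


def maxmean_score(stats: Sequence[float]) -> float:
    """Directional score: dominant side of the positive/negative parts.

    Both sides are averaged over ALL genes in the set, so a few strong
    genes on one side are not diluted by the opposite side.
    Assumption: the result carries the sign of the dominant side.
    """
    n = len(stats)
    pos = sum(s for s in stats if s > 0) / n
    neg = sum(-s for s in stats if s < 0) / n
    return pos if pos >= neg else -neg
```

For hypothetical moderated t-statistics `[2.0, -1.0, 0.5, -0.5]`, the mean score is diluted by the negative genes, while maxmean reports the dominant up-regulated side.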
14.
Anal Chem ; 95(41): 15384-15393, 2023 Oct 17.
Article in English | MEDLINE | ID: mdl-37801728

ABSTRACT

Glass is by far the most common substrate for biomolecular arrays, including high-throughput sequencing flow cells and microarrays. The native glass hydroxyl surface is modified by using silane chemistry to provide appropriate functional groups and reactivities for either in situ synthesis or surface immobilization of biologically or chemically synthesized biomolecules. These arrays, typically of oligonucleotides or peptides, are then subjected to long incubation times in warm aqueous buffers prior to fluorescence readout. Under these conditions, the siloxy bonds to the glass are susceptible to hydrolysis, resulting in significant loss of biomolecules and concomitant loss of signal from the assay. Here, we demonstrate that functionalization of glass surfaces with dipodal silanes results in greatly improved stability compared to equivalent functionalization with standard monopodal silanes. Using photolithographic in situ synthesis of DNA, we show that dipodal silanes are compatible with phosphoramidite chemistry and that hybridization performed on the resulting arrays provides greatly improved signal and signal-to-noise ratios compared with surfaces functionalized with monopodal silanes.


Subject(s)
High-Throughput Screening Assays , Silanes , Oligonucleotide Array Sequence Analysis/methods , Silanes/chemistry , Nucleic Acid Hybridization/methods , DNA/chemistry , Glass/chemistry , Surface Properties
15.
PLoS One ; 18(8): e0289971, 2023.
Article in English | MEDLINE | ID: mdl-37561760

ABSTRACT

As breast cancer is a multistage progression disease resulting from a sequence of genetic mutations, understanding the genes whose expression values increase or decrease monotonically across pathologic stages can provide insightful clues about how breast cancer initiates and advances. Utilizing variational autoencoder (VAE) networks in conjunction with traditional statistical testing, we identify long non-coding RNAs (lncRNAs) that exhibit monotonically differential expression in breast cancer. Subsequently, we validate that the identified lncRNAs do exhibit monotonic patterns. The proposed procedure identified 248 lncRNAs with monotonically decreasing expression and 115 with monotonically increasing expression. These correspond to 65 and 33 genes, respectively, that have unique known gene symbols. Some of them are associated with breast cancer, as suggested by previous studies. Furthermore, pathways enriched among the target mRNAs of the identified lncRNAs include the Wnt signaling pathway, human papillomavirus (HPV) infection, and the Rap1 signaling pathway, which have been shown to play crucial roles in the initiation and development of breast cancer. Additionally, we trained a VAE model using the entire dataset. To assess the effectiveness of the identified lncRNAs, a microarray dataset was employed as the test set. The results obtained from this evaluation were deemed satisfactory. In conclusion, further experimental validation of these lncRNAs in a large study is warranted, and the proposed procedure is highly recommended.


Subject(s)
Breast Neoplasms , RNA, Long Noncoding , Humans , Female , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , Breast Neoplasms/genetics , Oligonucleotide Array Sequence Analysis/methods , Wnt Signaling Pathway , RNA, Messenger/metabolism , Gene Expression Profiling
16.
Brief Bioinform ; 24(4)2023 Jul 20.
Article in English | MEDLINE | ID: mdl-37419612

ABSTRACT

Missing values (MVs) can adversely impact data analysis and machine-learning model development. We propose a novel mixed-model method for missing value imputation (MVI). This method, ProJect (short for Protein inJection), is a powerful and meaningful improvement over existing MVI methods such as Bayesian principal component analysis (PCA), probabilistic PCA, local least squares and quantile regression imputation of left-censored data. We rigorously tested ProJect on various high-throughput data types, including genomics and mass spectrometry (MS)-based proteomics. Specifically, we utilized renal cancer (RC) data acquired using DIA-SWATH, ovarian cancer (OC) data acquired using DIA-MS, and bladder (BladderBatch) and glioblastoma (GBM) microarray gene expression datasets. Our results demonstrate that ProJect consistently performs better than other referenced MVI methods. It achieves the lowest normalized root mean square error (on average, scoring 45.92% less error in RC_C, 27.37% in RC_full, 29.22% in OC, 23.65% in BladderBatch and 20.20% in GBM relative to the closest competing method) and the lowest Procrustes sum of squared errors (Procrustes SS) (with 79.71% less error in RC_C, 38.36% in RC_full, 18.13% in OC, 74.74% in BladderBatch and 30.79% in GBM compared to the next best method). ProJect also leads with the highest correlation coefficient among all types of MV combinations (0.64% higher in RC_C, 0.24% in RC_full, 0.55% in OC, 0.39% in BladderBatch and 0.27% in GBM versus the second-best performing method). ProJect's key strength is its ability to handle different types of MVs commonly found in real-world data. Unlike most MVI methods, which are designed to handle only one type of MV, ProJect employs a decision-making algorithm that first determines whether an MV is missing at random or missing not at random. It then employs targeted imputation strategies for each MV type, resulting in more accurate and reliable imputation outcomes. 
An R implementation of ProJect is available at https://github.com/miaomiao6606/ProJect.


Subject(s)
Algorithms , Genomics , Bayes Theorem , Oligonucleotide Array Sequence Analysis/methods , Mass Spectrometry/methods
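
The normalized root mean square error (NRMSE) used in the abstract above to compare imputation methods can be sketched as follows. Several normalizations are in use; this sketch assumes the variant that divides the mean squared error by the variance of the true values, which may not match the exact definition used by the ProJect authors. The function name and argument layout are illustrative.

```python
import math
from statistics import pvariance
from typing import Sequence


def nrmse(true_values: Sequence[float], imputed_values: Sequence[float]) -> float:
    """NRMSE over the entries that were masked and then imputed.

    Assumed definition: sqrt( mean((imputed - true)^2) / var(true) ).
    A value of 0 means perfect recovery; values near 1 are about as
    good as imputing every entry with the mean of the true values.
    """
    if len(true_values) != len(imputed_values):
        raise ValueError("inputs must have the same length")
    if len(true_values) < 2:
        raise ValueError("need at least two values")
    mse = sum((t - i) ** 2 for t, i in zip(true_values, imputed_values)) / len(
        true_values
    )
    variance = pvariance(true_values)
    if variance == 0:
        raise ValueError("true values have zero variance")
    return math.sqrt(mse / variance)
```

In a benchmark of this kind, known values are hidden, each method imputes them, and the method with the lowest NRMSE on the hidden entries is ranked best.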
17.
IEEE/ACM Trans Comput Biol Bioinform ; 20(5): 2802-2809, 2023.
Article in English | MEDLINE | ID: mdl-37285246

ABSTRACT

Biclustering algorithms are essential for processing gene expression data. However, to process the dataset, most biclustering algorithms require preprocessing the data matrix into a binary matrix. Regrettably, this type of preprocessing may introduce noise or cause information loss in the binary matrix, which would reduce the biclustering algorithm's ability to effectively obtain the optimal biclusters. In this paper, we propose a new preprocessing method named Mean-Standard Deviation (MSD) to resolve the problem. Additionally, we introduce a new biclustering algorithm called Weight Adjacency Difference Matrix Binary Biclustering (W-AMBB) to effectively process datasets containing overlapping biclusters. The basic idea is to create a weighted adjacency difference matrix by applying weights to a binary matrix that is derived from the data matrix. This allows us to identify genes with significant associations in sample data by efficiently identifying similar genes that respond to specific conditions. Furthermore, the performance of the W-AMBB algorithm was tested on both synthetic and real datasets and compared with other classical biclustering methods. The experiment results demonstrate that the W-AMBB algorithm is significantly more robust than the compared biclustering methods on the synthetic dataset. Additionally, the results of the GO enrichment analysis show that the W-AMBB method possesses biological significance on real datasets.


Subject(s)
Algorithms , Gene Expression Profiling , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Cluster Analysis , Gene Expression
18.
Methods Mol Biol ; 2639: 69-81, 2023.
Article in English | MEDLINE | ID: mdl-37166711

ABSTRACT

In biology, molecular cascade signaling is an essential tool to mediate various pathways and downstream behaviors. Mimicking these molecular cascades plays an important role in synthetic biology. The use of DNA self-assembly represents an elegant way to build sophisticated molecular cascades. For instance, a DNA molecular array connected by a number of dynamic anti-junction units was able to realize prescribed, multistep, long-range cascaded transformation. The dynamic DNA molecular array is able to execute transformations with programmable initiation, propagation, and regulation. The transformation of the array can be initiated at selected units and then propagated, without addition of extra triggers, to neighboring units and eventually the entire array.


Subject(s)
DNA , Nanotechnology , DNA/genetics , Oligonucleotide Array Sequence Analysis/methods , Nanotechnology/methods
19.
Forensic Sci Int Genet ; 65: 102885, 2023 07.
Article in English | MEDLINE | ID: mdl-37137205

ABSTRACT

Since the arrest of the Golden State Killer in the US in April 2018, forensic geneticists have been increasingly interested in the investigative genetic genealogy (IGG) method. While this method is already in practical use as a powerful tool for criminal investigation, its limitations and potential risks are not yet well understood. In this study, we performed an evaluation focusing on degraded DNA using the Affymetrix Genome-Wide Human SNP Array 6.0 platform (Thermo Fisher Scientific). We revealed one of the potential problems that occur during SNP genotype determination using a microarray-based platform. Our analysis indicated that the SNP profiles derived from degraded DNA contained many false heterozygous SNPs. In addition, we confirmed that the total probe signal intensity on microarray chips derived from degraded DNA decreased significantly. Because the conventional analysis algorithm performs normalization during genotype determination, we concluded that noise signals could be genotype-called. To address this issue, we proposed a novel microarray data analysis method without normalization (nMAP). Although the nMAP algorithm resulted in a low call rate, it substantially improved genotyping accuracy. Finally, we confirmed the usefulness of the nMAP algorithm for kinship inferences. These findings and the nMAP algorithm will contribute to the advancement of the IGG method.
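
The reasoning in the abstract above (weak signals from degraded DNA can be rescaled by normalization and then called as genotypes) can be made concrete with a small Python sketch. It is not the nMAP algorithm, whose details are not given here. The function name `call_genotype`, the raw-intensity threshold `min_total` and the heterozygous band `het_band` are hypothetical values chosen only to show how working on unnormalized intensities lets a caller refuse noise-level probes, trading call rate for accuracy.

```python
def call_genotype(a, b, min_total=500.0, het_band=(0.35, 0.65)):
    """Call a biallelic SNP from raw (unnormalized) A and B probe intensities.

    Returns "NC" (no call) when the total signal is below `min_total`,
    otherwise "AA", "AB" or "BB" based on the fraction of signal from A.
    All thresholds are illustrative assumptions.
    """
    total = a + b
    if total < min_total:
        # Noise-level signal: without this check the A fraction alone
        # (30 / 55 here, about 0.55) would look heterozygous.
        return "NC"
    fraction_a = a / total
    if fraction_a >= het_band[1]:
        return "AA"
    if fraction_a <= het_band[0]:
        return "BB"
    return "AB"


calls = [
    call_genotype(30.0, 25.0),      # degraded sample, near-background signal
    call_genotype(1800.0, 150.0),   # strong A signal
    call_genotype(900.0, 1000.0),   # balanced strong signal
    call_genotype(100.0, 1900.0),   # strong B signal
]
call_rate = sum(c != "NC" for c in calls) / len(calls)
```

The first probe pair shows the failure mode the abstract points to: judged by allele ratio alone it resembles a heterozygote, but its absolute intensity is too low to trust.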


Subject(s)
DNA , Immunoglobulin G , Humans , Genotype , Oligonucleotide Array Sequence Analysis/methods , DNA/genetics , Immunoglobulin G/genetics , Polymorphism, Single Nucleotide
20.
BMC Bioinformatics ; 24(1): 150, 2023 Apr 17.
Article in English | MEDLINE | ID: mdl-37069540

ABSTRACT

BACKGROUND: Gene expression profiling is a widely adopted method in areas like drug development or functional gene analysis. Microarray data from gene expression experiments are still commonly used and widely available for retrospective analyses. However, due to changes in the underlying technologies, data sets from different technologies are often difficult to compare, and thus a multitude of already available data becomes difficult to use. We present a web application that abstracts away mathematical and programming details in order to enable a convenient and customizable analysis of microarray data for large-scale reproducibility studies. In addition, the web application provides a feature that allows easy access to large microarray repositories. RESULTS: Our web application consists of three basic steps which are necessary for a differential gene expression analysis as well as Gene Ontology (GO) enrichment analysis and the comparison of multiple analysis results. Genealyzer can handle Affymetrix data as well as one-channel and two-channel Agilent data. All steps are visualized with meaningful plots. The application offers flexible analysis while being intuitively operable. CONCLUSIONS: Our web application provides a unified platform for analysing microarray data, while allowing users to compare the results of different technologies and organisms. Beyond reproducibility, this also offers many possibilities for gaining further insights from existing study data, especially since data from different technologies or organisms can also be compared. The web application can be accessed via this URL: https://genealyzer.item.fraunhofer.de/ . Login credentials can be found at the end.


Subject(s)
Gene Expression Profiling , Software , Oligonucleotide Array Sequence Analysis/methods , Reproducibility of Results , Retrospective Studies , Gene Expression Profiling/methods , Gene Expression , Internet
...