Results 1 - 20 of 39
1.
JCI Insight ; 9(8)2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38512356

ABSTRACT

BACKGROUND: Novel biomarkers to identify infectious patients transmitting Mycobacterium tuberculosis are urgently needed to control the global tuberculosis (TB) pandemic. We hypothesized that proteins released into the plasma in active pulmonary TB are clinically useful biomarkers to distinguish TB cases from healthy individuals and patients with other respiratory infections. METHODS: We applied a highly sensitive non-depletion tandem mass spectrometry discovery approach to investigate plasma protein expression in pulmonary TB cases compared to healthy controls in South African and Peruvian cohorts. Bioinformatic analysis using linear modeling and network correlation analyses identified 118 differentially expressed proteins, significant through 3 complementary analytical pipelines. Candidate biomarkers were subsequently analyzed in 2 validation cohorts of differing ethnicity using antibody-based proximity extension assays. RESULTS: TB-specific host biomarkers were confirmed. A 6-protein diagnostic panel, comprising FETUB, FCGR3B, LRG1, SELL, CD14, and ADA2, differentiated patients with pulmonary TB from healthy controls and patients with other respiratory infections with high sensitivity and specificity in both cohorts. CONCLUSION: This biomarker panel exceeds the World Health Organization Target Product Profile specificity criteria for a triage test for TB. 
The new biomarkers have potential for further development as near-patient TB screening assays, thereby helping to close the case-detection gap that fuels the global pandemic. FUNDING: Medical Research Council (MRC) (MR/R001065/1, MR/S024220/1, MR/P023754/1, and MR/W025728/1); the MRC and the UK Foreign Commonwealth and Development Office; the UK National Institute for Health Research (NIHR); the Wellcome Trust (094000, 203135, and CC2112); Starter Grant for Clinical Lecturers (Academy of Medical Sciences UK); the British Infection Association; the Program for Advanced Research Capacities for AIDS in Peru at Universidad Peruana Cayetano Heredia (D43TW00976301) from the Fogarty International Center at the US NIH; the UK Technology Strategy Board/Innovate UK (101556); the Francis Crick Institute, which receives funding from UKRI-MRC (CC2112); Cancer Research UK (CC2112); and the NIHR Biomedical Research Centre of Imperial College NHS.


Subjects
Biomarkers , Proteomics , Pulmonary Tuberculosis , Humans , Biomarkers/blood , Proteomics/methods , Male , Female , Adult , Pulmonary Tuberculosis/diagnosis , Pulmonary Tuberculosis/blood , Mycobacterium tuberculosis , Middle Aged , Peru/epidemiology , South Africa/epidemiology , Case-Control Studies , Sensitivity and Specificity
3.
Bioinformatics ; 40(2)2024 02 01.
Article in English | MEDLINE | ID: mdl-38310333

ABSTRACT

MOTIVATION: Protein language models (PLMs), which borrowed ideas for modelling and inference from natural language processing, have demonstrated the ability to extract meaningful representations in an unsupervised way. This led to significant performance improvement in several downstream tasks. Clustering amino acids based on their physicochemical properties to achieve reduced alphabets has been of interest in past research, but the application of reduced alphabets to PLMs or folding models is unexplored. RESULTS: Here, we investigate the efficacy of PLMs trained on reduced amino acid alphabets in capturing evolutionary information, and we explore how the loss of protein sequence information impacts learned representations and downstream task performance. Our empirical work shows that PLMs trained on the full alphabet and a large number of sequences capture fine details that are lost in alphabet reduction methods. We further show the ability of a structure prediction model (ESMFold) to fold CASP14 protein sequences translated using a reduced alphabet. For 10 proteins out of the 50 targets, reduced alphabets improve structural predictions with LDDT-Cα differences of up to 19%. AVAILABILITY AND IMPLEMENTATION: Trained models and code are available at github.com/Ieremie/reduced-alph-PLM.
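
As an illustration of what "translating a sequence using a reduced alphabet" can look like, the following minimal sketch maps each of the 20 standard residues to a group representative. The grouping in `GROUPS` is a hypothetical example chosen by broad physicochemical class; the abstract does not state which clusterings the authors used, and the function name `reduce_sequence` is likewise an assumption.

```python
# Hypothetical reduced alphabet: 20 amino acids collapsed into 7 groups.
# This grouping is illustrative only and is not taken from the paper.
GROUPS = {
    "A": "AVLIMC",  # aliphatic / hydrophobic
    "F": "FWYH",    # aromatic
    "S": "STNQ",    # polar, uncharged
    "K": "KR",      # positively charged
    "D": "DE",      # negatively charged
    "G": "G",       # glycine kept on its own
    "P": "P",       # proline kept on its own
}

# Invert the table: residue -> representative letter of its group.
RESIDUE_TO_GROUP = {res: rep for rep, members in GROUPS.items() for res in members}


def reduce_sequence(seq: str, unknown: str = "X") -> str:
    """Translate a protein sequence into the reduced alphabet.

    Residues outside the 20 standard amino acids map to `unknown`.
    """
    return "".join(RESIDUE_TO_GROUP.get(res, unknown) for res in seq.upper())


if __name__ == "__main__":
    print(reduce_sequence("MKTAYIAKQR"))
```

A translated sequence keeps its length, so it can be passed to any model that accepts the full alphabet; what is lost is the distinction between residues within a group.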


Subjects
Protein Folding , Proteins , Proteins/chemistry , Amino Acids/chemistry , Amino Acid Sequence , Amines
4.
Sensors (Basel) ; 23(23)2023 Nov 28.
Article in English | MEDLINE | ID: mdl-38067827

ABSTRACT

Understanding how the human body works during sleep and how this varies in the population is a task with significant implications for medicine. Polysomnographic studies, or sleep studies, are a common diagnostic method that produces a significant quantity of time-series sensor data. This study seeks to learn the causal structure from data from polysomnographic studies carried out on 600 adult volunteers in the United States. Two methods are used to learn the causal structure of these data: the well-established Granger causality and "DYNOTEARS", a modern approach that uses continuous optimisation to learn dynamic Bayesian networks (DBNs). The results from the two methods are then compared. Both methods produce graphs that have a number of similarities, including the mutual causation between electrooculogram (EOG) and electroencephalogram (EEG) signals and between sleeping position and SpO2 (blood oxygen level). However, DYNOTEARS, unlike Granger causality, frequently finds a causal link to sleeping position from the other variables. Following the creation of these causal graphs, the relationship between the discovered causal structure and the characteristics of the participants is explored. It is found that there is an association between the waist size of a participant and whether a causal link is found between the electrocardiogram (ECG) measurement and the EOG and EEG measurements. It is concluded that a person's body shape appears to impact the relationship between their heart and brain during sleep and that Granger causality and DYNOTEARS can produce differing results on real-world data.


Subjects
Brain , Sleep , Adult , Humans , Bayes Theorem , Causality
5.
Brief Bioinform ; 24(6)2023 09 22.
Article in English | MEDLINE | ID: mdl-37874950

ABSTRACT

Cluster analysis is a crucial stage in the analysis and interpretation of single-cell gene expression (scRNA-seq) data. It is an inherently ill-posed problem whose solutions depend heavily on hyper-parameter and algorithmic choice. The popular approach of K-means clustering, for example, depends heavily on the choice of K and the convergence of the expectation-maximization algorithm to local minima of the objective. Exhaustive search of the space for multiple good quality solutions is known to be a complex problem. Here, we show that quantum computing offers a solution to exploring the cost function of clustering by quantum annealing, implemented on a quantum computing facility offered by D-Wave [1]. Our formulation extracts a minimum vertex cover of an affinity graph to sub-sample the cell population and uses quantum annealing to optimise the cost function. A distribution of low-energy solutions can thus be extracted, offering alternate hypotheses about how genes group together in their space of expressions.


Subjects
Computing Methodologies , Quantum Theory , RNA-Seq , RNA Sequence Analysis , Algorithms , Cluster Analysis , Gene Expression Profiling
6.
Aging Clin Exp Res ; 35(7): 1449-1457, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37202598

ABSTRACT

BACKGROUND: Osteoarthritis is the most prevalent type of arthritis. Many approaches exist for characterising radiographic knee OA, including machine learning (ML). AIMS: To examine Kellgren and Lawrence (K&L) scores from ML and expert observation, minimum joint space and osteophyte in relation to pain and function. METHODS: Participants from the Hertfordshire Cohort Study, comprising individuals born in Hertfordshire from 1931 to 1939, were analysed. Radiographs were assessed by clinicians and ML (convolutional neural networks) for K&L scoring. Medial minimum joint space and osteophyte area were ascertained using the knee OA computer-aided diagnosis (KOACAD) program. The Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) was administered. Receiver operating characteristic analysis was implemented for minimum joint space, osteophyte, and observer- and ML-derived K&L scores in relation to pain (WOMAC pain score > 0) and impaired function (WOMAC function score > 0). RESULTS: 359 participants (aged 71-80) were analysed. Among both sexes, discriminative capacity regarding pain and function was fairly high for observer-derived K&L scores [area under curve (AUC): 0.65 (95% CI 0.57, 0.72) to 0.70 (0.63, 0.77)]; results were similar among women for ML-derived K&L scores. Discriminative capacity was moderate among men for minimum joint space in relation to pain [0.60 (0.51, 0.67)] and function [0.62 (0.54, 0.69)]. AUC < 0.60 for other sex-specific associations. DISCUSSION: Observer-derived K&L scores had higher discriminative capacity regarding pain and function compared to minimum joint space and osteophyte. Among women, discriminative capacity was similar for observer- and ML-derived K&L scores. CONCLUSION: ML as an adjunct to expert observation for K&L scoring may be beneficial due to the efficiency and objectivity of ML.
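
The AUC values reported above summarise how well a score (for example a K&L grade) separates participants with and without an outcome (for example WOMAC pain score > 0). As a minimal sketch, the AUC can be computed as the fraction of positive/negative pairs in which the positive case has the higher score, with ties counted as half. The function name `roc_auc` and the example numbers are illustrative assumptions, not data from the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equivalent to the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5).
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up grades (0-4) and pain indicators for six knees.
    grades = [0, 1, 2, 3, 4, 2]
    has_pain = [0, 0, 1, 1, 1, 0]
    print(round(roc_auc(grades, has_pain), 3))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which is the scale on which the 0.60 to 0.70 values in the abstract should be read.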


Subjects
Knee Osteoarthritis , Osteophyte , Male , Humans , Female , Knee Osteoarthritis/diagnostic imaging , Cohort Studies , Osteophyte/diagnostic imaging , Knee Joint , Pain , Severity of Illness Index
7.
Bone ; 168: 116653, 2023 03.
Article in English | MEDLINE | ID: mdl-36581259

ABSTRACT

BACKGROUND: Traditional analysis of High Resolution peripheral Quantitative Computed Tomography (HR-pQCT) images results in a multitude of cortical and trabecular parameters which would be potentially cumbersome to interpret for clinicians compared to user-friendly tools utilising clinical parameters. A computer vision approach (by which the entire scan is 'read' by a computer algorithm) to ascertain fracture risk, would be far simpler. We therefore investigated whether a computer vision and machine learning technique could improve upon selected clinical parameters in assessing fracture risk. METHODS: Participants of the Hertfordshire Cohort Study (HCS) attended research visits at which height and weight were measured; fracture history was determined via self-report and vertebral fracture assessment. Bone microarchitecture was assessed via HR-pQCT scans of the non-dominant distal tibia (Scanco XtremeCT), and bone mineral density measurement and lateral vertebral assessment were performed using dual-energy X-ray absorptiometry (DXA) (Lunar Prodigy Advanced). Images were cropped, pre-processed and texture analysis was performed using a three-dimensional local binary pattern method. These image data, together with age, sex, height, weight, BMI, dietary calcium and femoral neck BMD, were used in a random-forest classification algorithm. Receiver operating characteristic (ROC) analysis was used to compare fracture risk identification methods. RESULTS: Overall, 180 males and 165 females were included in this study with a mean age of approximately 76 years and 97 (28 %) participants had sustained a previous fracture. Using clinical risk factors alone resulted in an area under the curve (AUC) of 0.70 (95 % CI: 0.56-0.84), which improved to 0.71 (0.57-0.85) with the addition of DXA-measured BMD. 
The addition of HR-pQCT image data to the machine learning classifier with clinical risk factors and DXA-measured BMD as inputs led to an improved AUC of 0.90 (0.83-0.96) with a sensitivity of 0.83 and specificity of 0.74. CONCLUSION: These results suggest that applying a three-dimensional computer vision method to HR-pQCT scans may enhance the identification of those at risk of fracture beyond that afforded by clinical risk factors and DXA-measured BMD. This approach has the potential to make the information offered by HR-pQCT more accessible (and therefore applicable) to healthcare professionals in the clinic if the technology becomes more widely available.


Subjects
Bone Fractures , Male , Female , Humans , Aged , Photon Absorptiometry/methods , Cohort Studies , Bone Fractures/diagnostic imaging , Bone Density , Risk Factors , Femur Neck , Radius
8.
J Cheminform ; 14(1): 59, 2022 Sep 01.
Article in English | MEDLINE | ID: mdl-36050750

ABSTRACT

The related problems of chemical reaction optimization and reaction scope search concern the discovery of reaction pathways and conditions that provide the best percentage yield of a target product. The space of possible reaction pathways or conditions is too large to search in full, so identifying a globally optimal set of conditions must instead draw on mathematical methods to identify areas of the space that should be investigated. An intriguing contribution to this area of research is the recent development of the Experimental Design for Bayesian optimization (EDBO) optimizer [1]. Bayesian optimization works by building an approximation to the true function to be optimized based on a small set of simulations, and selecting the next point (or points) to be tested based on an acquisition function reflecting the value of different points within the input space. In this work, we evaluated the robustness of the EDBO optimizer under several changes to its specification. We investigated the effect on the performance of the optimizer of altering the acquisition function and batch size, applied the method to other existing reaction yield data sets, and considered its performance in the new problem domain of molecular power conversion efficiency in photovoltaic cells. Our results indicated that the EDBO optimizer broadly performs well under these changes; of particular note is the competitive performance of the computationally cheaper acquisition function Thompson Sampling when compared to the original Expected Improvement function, and some concerns around the method's performance for "incomplete" input domains.
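
To make the two acquisition functions named above concrete, here is a minimal sketch under strong simplifying assumptions: the surrogate's prediction at each candidate is treated as an independent Gaussian with mean `mu` and standard deviation `sigma`, and the goal is to maximise yield. This is not the EDBO implementation; the function names and the independence assumption (a full Gaussian-process sample would be drawn jointly) are illustrative.

```python
import math
import random


def expected_improvement(mu, sigma, best):
    """Closed-form Expected Improvement for maximisation.

    EI = (mu - best) * Phi(z) + sigma * phi(z), with z = (mu - best) / sigma.
    """
    if sigma <= 0.0:
        # No predictive uncertainty: improvement is deterministic.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf


def thompson_pick(mus, sigmas, rng):
    """Thompson Sampling, simplified: draw one value per candidate from its
    (assumed independent) Gaussian posterior and return the argmax index."""
    draws = [rng.gauss(m, s) for m, s in zip(mus, sigmas)]
    return max(range(len(draws)), key=draws.__getitem__)


if __name__ == "__main__":
    mus, sigmas, best_so_far = [0.62, 0.70, 0.55], [0.05, 0.01, 0.20], 0.68
    ei = [expected_improvement(m, s, best_so_far) for m, s in zip(mus, sigmas)]
    print("EI choice:", ei.index(max(ei)))
    print("Thompson choice:", thompson_pick(mus, sigmas, random.Random(0)))
```

Thompson Sampling needs only one posterior draw per candidate rather than an integral over improvements, which is one reason it can be the computationally cheaper option mentioned in the abstract.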

9.
PLoS One ; 17(6): e0269159, 2022.
Article in English | MEDLINE | ID: mdl-35657932

ABSTRACT

BACKGROUND: It is estimated that up to 50% of all disease-causing variants disrupt splicing. Due to its complexity, our ability to predict which variants disrupt splicing is limited, meaning missed diagnoses for patients. The emergence of machine learning for targeted medicine holds great potential to improve prediction of splice-disrupting variants. The recently published SpliceAI algorithm utilises deep neural networks and has been reported to have a greater accuracy than other commonly used methods. METHODS AND FINDINGS: The original SpliceAI was trained on splice sites included in primary isoforms combined with novel junctions observed in GTEx data, which might introduce noise and de-correlate the machine learning input with its output. Limiting the data to only validated and manually annotated primary and alternatively spliced GENCODE sites in training may improve predictive abilities. All of these gene isoforms were collapsed (aggregated into one pseudo-isoform) and the SpliceAI architecture was retrained (CI-SpliceAI). Predictive performance on a newly curated dataset of 1,316 functionally validated variants from the literature was compared with the original SpliceAI, alongside MMSplice, MaxEntScan, and SQUIRLS. Both SpliceAI algorithms outperformed the other methods, with the original SpliceAI achieving an accuracy of ∼91%, and CI-SpliceAI showing an improvement at ∼92% overall. Predictive accuracy increased in the majority of curated variants. CONCLUSIONS: We show that including only manually annotated alternatively spliced sites in training data improves prediction of clinically relevant variants, and highlight avenues for further performance improvements.


Subjects
RNA Splice Sites , RNA Splicing , Alternative Splicing , Humans , Machine Learning , Mutation , Computer Neural Networks , RNA Splice Sites/genetics
10.
Bioinformatics ; 38(8): 2269-2277, 2022 04 12.
Article in English | MEDLINE | ID: mdl-35176146

ABSTRACT

MOTIVATION: Protein-protein interactions (PPIs) play a key role in diverse biological processes but only a small subset of the interactions has been experimentally identified. Additionally, high-throughput experimental techniques that detect PPIs are known to suffer various limitations, such as high false-positive and false-negative rates. The semantic similarity derived from the Gene Ontology (GO) annotation is regarded as one of the most powerful indicators for protein interactions. However, while computational approaches for prediction of PPIs have gained popularity in recent years, most methods fail to capture the specificity of GO terms. RESULTS: We propose TransformerGO, a model that is capable of capturing the semantic similarity between GO sets dynamically using an attention mechanism. We generate dense graph embeddings for GO terms using an algorithmic framework for learning continuous representations of nodes in networks called node2vec. TransformerGO learns deep semantic relations between annotated terms and can distinguish between negative and positive interactions with high accuracy. TransformerGO outperforms classic semantic similarity measures on gold standard PPI datasets and state-of-the-art machine-learning-based approaches on large datasets from Saccharomyces cerevisiae and Homo sapiens. We show how the neural attention mechanism embedded in the transformer architecture detects relevant functional terms when predicting interactions. AVAILABILITY AND IMPLEMENTATION: https://github.com/Ieremie/TransformerGO. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Machine Learning , Humans , Gene Ontology , Saccharomyces cerevisiae/genetics , Molecular Sequence Annotation , Computational Biology/methods
11.
IEEE/ACM Trans Comput Biol Bioinform ; 19(6): 3340-3352, 2022.
Article in English | MEDLINE | ID: mdl-34705655

ABSTRACT

Recent advances in high throughput technologies have made large amounts of biomedical omics data accessible to the scientific community. Single omic data clustering has proved its impact in the biomedical and biological research fields. Multi-omic data clustering and multi-omic data integration techniques have shown improved clustering performance and biological insight. Cancer subtype clustering is an important task in the medical field to be able to identify a suitable treatment procedure and prognosis for cancer patients. State of the art multi-view clustering methods are based on non-convex objectives which only guarantee non-global solutions that are high in computational complexity. Only a few convex multi-view methods are present. However, their models do not take into account the intrinsic manifold structure of the data. In this paper, we introduce a convex graph regularized multi-view clustering method that is robust to outliers. We compare our algorithm to state of the art convex and non-convex multi-view and single view clustering methods, and show its superiority in clustering cancer subtypes on publicly available cancer genomic datasets from the TCGA repository. We also show our method's better ability to potentially discover cancer subtypes compared to other state of the art multi-view methods.


Subjects
Multiomics , Neoplasms , Humans , Genomics/methods , Algorithms , Cluster Analysis , Neoplasms/genetics
12.
Entropy (Basel) ; 23(10)2021 Oct 18.
Article in English | MEDLINE | ID: mdl-34682084

ABSTRACT

In solving challenging pattern recognition problems, deep neural networks have shown excellent performance by forming powerful mappings between inputs and targets, learning representations (features) and making subsequent predictions. A recent tool to help understand how representations are formed is based on observing the dynamics of learning on an information plane using mutual information, linking the input to the representation (I(X;T)) and the representation to the target (I(T;Y)). In this paper, we use an information theoretical approach to understand how Cascade Learning (CL), a method to train deep neural networks layer-by-layer, learns representations, as CL has shown comparable results while saving computation and memory costs. We observe that performance is not linked to information-compression, which differs from observation on End-to-End (E2E) learning. Additionally, CL can inherit information about targets, and gradually specialise extracted features layer-by-layer. We evaluate this effect by proposing an information transition ratio, I(T;Y)/I(X;T), and show that it can serve as a useful heuristic in setting the depth of a neural network that achieves satisfactory accuracy of classification.
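
The information transition ratio I(T;Y)/I(X;T) can be illustrated for discrete variables with a plug-in estimate of mutual information. This is a minimal sketch only: real layer activations are continuous and would need binning or another estimator, which the abstract does not specify, and the function names here are assumptions.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts.
    return sum((c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def information_transition_ratio(x, t, y):
    """I(T;Y) / I(X;T) for input X, representation T and target Y."""
    i_xt = mutual_information(x, t)
    if i_xt == 0:
        raise ValueError("I(X;T) is zero; the ratio is undefined")
    return mutual_information(t, y) / i_xt


if __name__ == "__main__":
    x = [0, 1, 2, 3]   # four distinct inputs
    t = [0, 0, 1, 1]   # representation keeps one bit of the input
    y = [0, 0, 1, 1]   # target is exactly that bit
    print(information_transition_ratio(x, t, y))
```

In this toy case every bit the representation retains about the input is also informative about the target, so the ratio reaches 1; a representation that retains input information unrelated to the target gives a ratio closer to 0.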

13.
Patterns (N Y) ; 2(1): 100162, 2021 Jan 08.
Article in English | MEDLINE | ID: mdl-33511363

ABSTRACT

The Artificial Intelligence and Augmented Intelligence for Automated Investigation for Scientific Discovery Network+ (AI3SD) was established in response to the UK Engineering and Physical Sciences Research Council (EPSRC) late-2017 call for a Network+ to promote cutting-edge research in artificial intelligence to accelerate groundbreaking scientific discoveries. This article provides the philosophical, scientific, and technical underpinnings of the Network+, the history of the different domains represented in the Network+, and the specific focus of the Network+. The activities, collaborations, and research covered in the first year of the Network+ have highlighted the significant challenges in the chemistry and augmented and artificial intelligence space. These challenges are shaping the future directions of the Network+. The article concludes with a summary of the lessons learned in running this Network+ and introduces our plans for the future in a landscape redrawn by COVID-19, including rebranding into the AI 4 Scientific Discovery Network (www.ai4science.network).

14.
Commun Biol ; 3(1): 736, 2020 12 04.
Article in English | MEDLINE | ID: mdl-33277618

ABSTRACT

Biomedical research often involves conducting experiments on model organisms in the anticipation that the biology learnt will transfer to humans. Previous comparative studies of mouse and human tissues were limited by the use of bulk-cell material. Here we show that transfer learning, the branch of machine learning that concerns passing information from one domain to another, can be used to efficiently map bone marrow biology between species, using data obtained from single-cell RNA sequencing. We first trained a multiclass logistic regression model to recognize different cell types in mouse bone marrow, achieving equivalent performance to more complex artificial neural networks. Furthermore, it was able to identify individual human bone marrow cells with 83% overall accuracy. However, some human cell types were not easily identified, indicating important differences in biology. When re-training the mouse classifier using data from human, fewer than 10 human cells of a given type were needed to accurately learn its representation. In some cases, human cell identities could be inferred directly from the mouse classifier via zero-shot learning. These results show how simple machine learning models can be used to reconstruct complex biology from limited data, with broad implications for biomedical research.


Subjects
Bone Marrow Cells/classification , Machine Learning , RNA Sequence Analysis , Single-Cell Analysis , Animals , Cell Separation , Humans , Mice
15.
R Soc Open Sci ; 7(2): 190714, 2020 Feb.
Article in English | MEDLINE | ID: mdl-32257299

ABSTRACT

The application of machine learning to inference problems in biology is dominated by supervised learning problems of regression and classification, and unsupervised learning problems of clustering and variants of low-dimensional projections for visualization. A class of problems that have not gained much attention is detecting outliers in datasets, arising from reasons such as gross experimental, reporting or labelling errors. These could also be small parts of a dataset that are functionally distinct from the majority of a population. Outlier data are often identified by considering the probability density of normal data and comparing data likelihoods against some threshold. This classical approach suffers from the curse of dimensionality, which is a serious problem with omics data which are often found in very high dimensions. We develop an outlier detection method based on structured low-rank approximation methods. The objective function includes a regularizer based on neighbourhood information captured in the graph Laplacian. Results on publicly available genomic data show that our method robustly detects outliers whereas a density-based method fails even at moderate dimensions. Moreover, we show that our method has better clustering and visualization performance on the recovered low-dimensional projection when compared with popular dimensionality reduction techniques.

16.
PLoS One ; 14(12): e0226256, 2019.
Article in English | MEDLINE | ID: mdl-31834914

ABSTRACT

Previous work has shown that proteins that have the potential to be vaccine candidates can be predicted from features derived from their amino acid sequences. In this work, we make an empirical comparison across various machine learning classifiers on this sequence-based inference problem. Using systematic cross validation on a dataset of 200 known vaccine candidates and 200 negative examples, with a set of 525 features derived from the AA sequences and feature selection applied through a greedy backward elimination approach, we show that simple classification algorithms often perform as well as more complex support vector kernel machines. The work also includes a novel cross validation applied across bacterial species, i.e. the validation proteins all come from a specific species of bacterium not represented in the training set. We termed this type of validation Leave One Bacteria Out Validation (LOBOV).
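
The Leave One Bacteria Out Validation idea, in which every validation protein comes from a species absent from the training set, can be sketched as a grouped split. The helper name `leave_one_species_out` and the species labels below are hypothetical; the abstract does not describe the authors' implementation.

```python
def leave_one_species_out(species):
    """Yield (held_out, train_idx, test_idx) with one species held out per fold.

    `species` lists the source species of each protein, in dataset order.
    No protein from the held-out species appears in the training indices.
    """
    for held_out in sorted(set(species)):
        train_idx = [i for i, s in enumerate(species) if s != held_out]
        test_idx = [i for i, s in enumerate(species) if s == held_out]
        yield held_out, train_idx, test_idx


if __name__ == "__main__":
    # Illustrative labels only; not the species list used in the study.
    labels = ["E. coli", "E. coli", "N. meningitidis", "S. aureus"]
    for held_out, train_idx, test_idx in leave_one_species_out(labels):
        print(held_out, train_idx, test_idx)
```

Compared with random cross validation, this split prevents a classifier from scoring well merely by recognising species-specific sequence features shared between training and validation proteins.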


Subjects
Algorithms , Bacterial Antigens/immunology , Bacterial Proteins/immunology , Bacterial Vaccines/immunology , Computational Biology/methods , Vaccinology , Humans , Machine Learning
17.
BMC Bioinformatics ; 20(1): 536, 2019 Oct 29.
Article in English | MEDLINE | ID: mdl-31664894

ABSTRACT

BACKGROUND: Analysis of high-throughput multi-'omics interactions across the hierarchy of expression has wide interest in making inferences with regard to biological function and biomarker discovery. Expression levels across different scales are determined by robust synthesis, regulation and degradation processes, and hence transcript (mRNA) measurements made by microarray/RNA-Seq only show modest correlation with corresponding protein levels. RESULTS: In this work we are interested in quantitative modelling of correlation across such gene products. Building on recent work, we develop computational models spanning transcript, translation and protein levels at different stages of the H. sapiens cell cycle. We enhance this analysis by incorporating 25+ sequence-derived features which are likely determinants of cellular protein concentration and quantitatively select for relevant features, producing a vast dataset with thousands of genes. We reveal insights into the complex interplay between expression levels across time, using machine learning methods to highlight outliers with respect to such models as proteins associated with post-translationally regulated modes of action. CONCLUSIONS: We uncover quantitative separation between modified and degraded proteins that have roles in cell cycle regulation, chromatin remodelling and protein catabolism according to Gene Ontology; and highlight the opportunities for providing biological insights in future model systems.


Subjects
Cell Division , Gene Expression Profiling/methods , Genomics , Humans , Protein Biosynthesis , Proteins/genetics , Formal Social Control
18.
J Cheminform ; 10(1): 54, 2018 Nov 20.
Article in English | MEDLINE | ID: mdl-30460426

ABSTRACT

Topological data analysis (TDA) is a family of recent mathematical techniques seeking to understand the 'shape' of data, and has been used to understand the structure of the descriptor space produced by a standard chemical informatics software package from the point of view of solubility. We have used the mapper algorithm, a TDA method that creates low-dimensional representations of data, to create a network visualization of the solubility space. While descriptors with clear chemical implications are prominent features in this space, reflecting their importance to the chemical properties, an unexpected and interesting correlation between chlorine content and rings and their implication for solubility prediction is revealed. A parallel representation of the chemical space was generated using persistent homology applied to molecular graphs. Links between this chemical space and the descriptor space were shown to be in agreement with chemical heuristics. The use of persistent homology on molecular graphs, extended by the use of norms on the associated persistence landscapes, allows the conversion of discrete shape descriptors to continuous ones, and a perspective on the application of these descriptors to quantitative structure property relations is presented.

19.
IEEE Trans Neural Netw Learn Syst ; 29(11): 5475-5485, 2018 11.
Article in English | MEDLINE | ID: mdl-29993618

ABSTRACT

In this paper, we propose a novel approach for efficient training of deep neural networks in a bottom-up fashion using a layered structure. Our algorithm, which we refer to as deep cascade learning, is motivated by the cascade correlation approach of Fahlman and Lebiere, who introduced it in the context of perceptrons. We demonstrate our algorithm on networks of convolutional layers, though its applicability is more general. Such training of deep networks in a cascade directly circumvents the well-known vanishing gradient problem by ensuring that the output is always adjacent to the layer being trained. We present empirical evaluations comparing our deep cascade training with standard end-to-end training using backpropagation on two convolutional neural network architectures on benchmark image classification tasks (CIFAR-10 and CIFAR-100). We then investigate the features learned by the approach and find that better, domain-specific, representations are learned in early layers when compared to what is learned in end-to-end training. This is partially attributable to the vanishing gradient problem, which prevents early-layer filters from changing significantly from their initial settings. While both networks perform similarly overall, recognition accuracy increases progressively with each added layer, with discriminative features learned in every stage of the network, whereas in end-to-end training, no such systematic feature representation was observed. We also show that such cascade training has significant computational and memory advantages over end-to-end training, and can be used as a pretraining algorithm to obtain a better performance.

20.
J Immunol ; 201(1): 251-263, 2018 07 01.
Article in English | MEDLINE | ID: mdl-29769273

ABSTRACT

MicroRNAs are small noncoding RNAs that inhibit gene expression posttranscriptionally, implicated in virtually all biological processes. Although the effect of individual microRNAs is generally studied, the genome-wide role of multiple microRNAs is less investigated. We assessed paired genome-wide expression of microRNAs with total (cytoplasmic) and translational (polyribosome-bound) mRNA levels employing subcellular fractionation and RNA sequencing (Frac-seq) in human primary bronchoepithelium from healthy controls and severe asthmatics. Severe asthma is a chronic inflammatory disease of the airways characterized by poor response to therapy. We found genes (i.e., isoforms of a gene) and mRNA isoforms differentially expressed in asthma, with novel inflammatory and structural pathophysiological mechanisms related to bronchoepithelium disclosed solely by polyribosome-bound mRNAs (e.g., IL1A and LTB genes or ITGA6 and ITGA2 alternatively spliced isoforms). Gene expression (i.e., isoforms of a gene) and mRNA expression analysis revealed different molecular candidates and biological pathways, with differentially expressed polyribosome-bound and total mRNAs also showing little overlap. We reveal a hub of six dysregulated microRNAs accounting for ∼90% of all microRNA targeting, displaying preference for polyribosome-bound mRNAs. Transfection of this hub in bronchial epithelial cells from healthy donors mimicked asthma characteristics. Our work demonstrates extensive posttranscriptional gene dysregulation in human asthma, in which microRNAs play a central role, illustrating the feasibility and importance of assessing posttranscriptional gene expression when investigating human disease.


Subjects
Asthma/genetics , Epithelial Cells/metabolism , Gene Expression Regulation/genetics , MicroRNAs/genetics , RNA Isoforms/genetics , Respiratory Mucosa/cytology , Adolescent , Adult , Aged , Alternative Splicing/genetics , Base Sequence , Female , Humans , Male , Middle Aged , Messenger RNA/genetics , RNA Sequence Analysis , Surveys and Questionnaires , Young Adult