1.
Nat Immunol ; 22(1): 74-85, 2021 01.
Article in English | MEDLINE | ID: mdl-32999467

ABSTRACT

T cell immunity is central for the control of viral infections. To characterize T cell immunity, but also for the development of vaccines, identification of exact viral T cell epitopes is fundamental. Here we identify and characterize multiple dominant and subdominant SARS-CoV-2 HLA class I and HLA-DR peptides as potential T cell epitopes in COVID-19 convalescent and unexposed individuals. SARS-CoV-2-specific peptides enabled detection of post-infectious T cell immunity, even in seronegative convalescent individuals. Cross-reactive SARS-CoV-2 peptides revealed pre-existing T cell responses in 81% of unexposed individuals and validated similarity with common cold coronaviruses, providing a functional basis for heterologous immunity in SARS-CoV-2 infection. Diversity of SARS-CoV-2 T cell responses was associated with mild symptoms of COVID-19, providing evidence that immunity requires recognition of multiple epitopes. Together, the proposed SARS-CoV-2 T cell epitopes enable identification of heterologous and post-infectious T cell immunity and facilitate development of diagnostic, preventive and therapeutic measures for COVID-19.


Subjects
COVID-19/immunology , T-Lymphocyte Epitopes/immunology , Peptides/immunology , SARS-CoV-2/immunology , T-Lymphocytes/immunology , Viral Vaccines/immunology , COVID-19/prevention & control , COVID-19/virology , Cross Reactions/immunology , HLA-DR Antigens/immunology , HLA-DR Antigens/metabolism , Histocompatibility Antigens Class I/immunology , Histocompatibility Antigens Class I/metabolism , Humans , Immunologic Memory/immunology , SARS-CoV-2/physiology , T-Lymphocytes/metabolism , Viral Vaccines/administration & dosage
2.
Nature ; 630(8018): 912-919, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38867041

ABSTRACT

The ancient city of Chichén Itzá in Yucatán, Mexico, was one of the largest and most influential Maya settlements during the Late and Terminal Classic periods (AD 600-1000) and it remains one of the most intensively studied archaeological sites in Mesoamerica [1-4]. However, many questions about the social and cultural use of its ceremonial spaces, as well as its population's genetic ties to other Mesoamerican groups, remain unanswered [2]. Here we present genome-wide data obtained from 64 subadult individuals dating to around AD 500-900 that were found in a subterranean mass burial near the Sacred Cenote (sinkhole) in the ceremonial centre of Chichén Itzá. Genetic analyses showed that all analysed individuals were male and several individuals were closely related, including two pairs of monozygotic twins. Twins feature prominently in Mayan and broader Mesoamerican mythology, where they embody qualities of duality among deities and heroes [5], but until now they had not been identified in ancient Mayan mortuary contexts. Genetic comparison to present-day people in the region shows genetic continuity with the ancient inhabitants of Chichén Itzá, except at certain genetic loci related to human immunity, including the human leukocyte antigen complex, suggesting signals of adaptation due to infectious diseases introduced to the region during the colonial period.


Subjects
Ceremonial Behavior , Ancient DNA , Human Genome , Humans , Mexico , Human Genome/genetics , Male , Ancient DNA/analysis , Ancient History , Female , Burial/history , Archaeology , Twins/genetics , Medieval History
3.
Cell ; 153(5): 1149-63, 2013 May 23.
Article in English | MEDLINE | ID: mdl-23664763

ABSTRACT

Differentiation of human embryonic stem cells (hESCs) provides a unique opportunity to study the regulatory mechanisms that facilitate cellular transitions in a human context. To that end, we performed comprehensive transcriptional and epigenetic profiling of populations derived through directed differentiation of hESCs representing each of the three embryonic germ layers. Integration of whole-genome bisulfite sequencing, chromatin immunoprecipitation sequencing, and RNA sequencing reveals unique events associated with specification toward each lineage. Lineage-specific dynamic alterations in DNA methylation and H3K4me1 are evident at putative distal regulatory elements that are frequently bound by pluripotency factors in the undifferentiated hESCs. In addition, we identified germ-layer-specific H3K27me3 enrichment at sites exhibiting high DNA methylation in the undifferentiated state. A better understanding of these initial specification events will facilitate identification of deficiencies in current approaches, leading to more faithful differentiation strategies as well as providing insights into the rewiring of human regulatory programs during cellular transitions.


Subjects
Embryonic Stem Cells/metabolism , Genetic Epigenesis , Genetic Transcription , Acetylation , Cell Differentiation , Chromatin/chemistry , Chromatin/metabolism , DNA Methylation , Genetic Enhancer Elements , Histones/metabolism , Humans , Methylation
4.
Nat Methods ; 2024 Jul 04.
Article in English | MEDLINE | ID: mdl-38965444

ABSTRACT

The volume of public proteomics data is rapidly increasing, causing a computational challenge for large-scale reanalysis. Here, we introduce quantms ( https://quantms.org/ ), an open-source cloud-based pipeline for massively parallel proteomics data analysis. We used quantms to reanalyze 83 public ProteomeXchange datasets, comprising 29,354 instrument files from 13,132 human samples, to quantify 16,599 proteins based on 1.03 million unique peptides. quantms is based on standard file formats, improving the reproducibility, submission and dissemination of the data to ProteomeXchange.

5.
Bioinformatics ; 40(4), 2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38498849

ABSTRACT

MOTIVATION: Cross-linking mass spectrometry has made remarkable advancements in the high-throughput characterization of protein structures and interactions. The resulting pairs of cross-linked peptides typically require geometric assessment and validation, given the availability of their corresponding structures. RESULTS: CLAUDIO (Cross-linking Analysis Using Distances and Overlaps) is an open-source software tool designed for the automated analysis and validation of different varieties of large-scale cross-linking experiments. Many of the otherwise manual processes for structural validation (i.e. structure retrieval and mapping) are performed fully automatically to simplify and accelerate the data interpretation process. In addition, CLAUDIO has the ability to remap intra-protein links as inter-protein links and discover evidence for homo-multimers. AVAILABILITY AND IMPLEMENTATION: CLAUDIO is available as open-source software under the MIT license at https://github.com/KohlbacherLab/CLAUDIO.


Subjects
Peptides , Software , Peptides/chemistry , Mass Spectrometry , Cross-Linking Reagents/chemistry
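The geometric assessment mentioned in the abstract above can be pictured as a distance check between the two cross-linked residues in a known structure. The following is a minimal sketch, not CLAUDIO's actual code: the function name `within_linker_range`, the coordinates, and the 30 Å cutoff are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical cutoff in angstroms; real cutoffs depend on the cross-linker used.
MAX_LINK_DISTANCE = 30.0


def within_linker_range(coord_a, coord_b, cutoff=MAX_LINK_DISTANCE):
    """Return (distance, ok) for two residue coordinates given as (x, y, z).

    `ok` is True when the Euclidean distance does not exceed the cutoff.
    """
    distance = math.dist(coord_a, coord_b)
    return distance, distance <= cutoff


# Made-up coordinates for two cross-linked residues.
distance, ok = within_linker_range((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
print(distance, ok)
```

A link whose residues lie farther apart than the cutoff is a candidate for closer inspection, for example as a possible inter-protein link.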
6.
Proteomics ; 24(3-4): e2300068, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37997224

ABSTRACT

Top-down proteomics (TDP) directly analyzes intact proteins and thus provides more comprehensive qualitative and quantitative proteoform-level information than conventional bottom-up proteomics (BUP) that relies on digested peptides and protein inference. While significant advancements have been made in TDP in sample preparation, separation, instrumentation, and data analysis, reliable and reproducible data analysis still remains one of the major bottlenecks in TDP. A key step for robust data analysis is the establishment of an objective estimation of proteoform-level false discovery rate (FDR) in proteoform identification. The most widely used FDR estimation scheme is based on the target-decoy approach (TDA), which has primarily been established for BUP. We present evidence that the TDA-based FDR estimation may not work at the proteoform-level due to an overlooked factor, namely the erroneous deconvolution of precursor masses, which leads to incorrect FDR estimation. We argue that the conventional TDA-based FDR in proteoform identification is in fact protein-level FDR rather than proteoform-level FDR unless precursor deconvolution error rate is taken into account. To address this issue, we propose a formula to correct for proteoform-level FDR bias by combining TDA-based FDR and precursor deconvolution error rate.


Subjects
Peptides , Proteomics , DNA-Binding Proteins
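The target-decoy estimate discussed in the abstract above can be sketched in a few lines. The first function is the standard decoy-to-target ratio. The second function is an illustrative assumption only: the abstract does not state the authors' correction formula, so the sketch simply treats identification errors and precursor deconvolution errors as independent. All function names are hypothetical.

```python
def tda_fdr(n_target: int, n_decoy: int) -> float:
    """Standard target-decoy FDR estimate: decoy hits / target hits."""
    if n_target <= 0:
        raise ValueError("need at least one target hit")
    return min(1.0, n_decoy / n_target)


def combined_error(fdr: float, deconv_error: float) -> float:
    """ASSUMPTION, not the published formula: probability that an
    identification is wrong either through a false match or through a
    wrongly deconvolved precursor mass, assuming independence."""
    return 1.0 - (1.0 - fdr) * (1.0 - deconv_error)


fdr = tda_fdr(n_target=1000, n_decoy=10)
print(fdr, combined_error(fdr, deconv_error=0.05))
```

Under this assumption, even a modest deconvolution error rate dominates a 1% target-decoy estimate, which is the kind of bias the abstract warns about.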
7.
Proteomics ; 24(8): e2300144, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38629965

ABSTRACT

In protein-RNA cross-linking mass spectrometry, UV or chemical cross-linking introduces stable bonds between amino acids and nucleic acids in protein-RNA complexes that are then analyzed and detected in mass spectra. This analytical tool delivers valuable information about RNA-protein interactions and RNA docking sites in proteins, both in vitro and in vivo. The identification of cross-linked peptides with oligonucleotides of different length leads to a combinatorial increase in search space. We demonstrate that the peptide retention time prediction tasks can be transferred to the task of cross-linked peptide retention time prediction using a simple amino acid composition encoding, yielding improved identification rates when the prediction error is included in rescoring. For the more challenging task of including fragment intensity prediction of cross-linked peptides in the rescoring, we obtain, on average, a similar improvement. Further improvement in the encoding and fine-tuning of retention time and intensity prediction models might lead to further gains and merits further research.


Subjects
Nucleic Acids , RNA , Amino Acids , Mass Spectrometry , Peptides
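An amino acid composition encoding, as named in the abstract above, can be as simple as a count vector over the 20 standard residues. This is a sketch under that assumption; the abstract does not specify the exact encoding, and the function name and residue ordering here are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids in alphabetical one-letter order (assumed ordering).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_encoding(peptide: str) -> list[int]:
    """Encode a peptide as per-residue counts, ignoring residue order."""
    counts = Counter(peptide.upper())
    unknown = set(counts) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"unsupported residues: {sorted(unknown)}")
    return [counts[aa] for aa in AMINO_ACIDS]


vector = composition_encoding("PEPTIDE")
print(dict(zip(AMINO_ACIDS, vector)))
```

Because the vector discards sequence order, it can serve as a fixed-length input to a retention time model regardless of peptide length.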
8.
Article in German | MEDLINE | ID: mdl-38684526

ABSTRACT

Healthcare data are an important resource in applied medical research. They are available across multiple centers. However, it remains a challenge to enable standardized data exchange processes between federal states and their individual laws and regulations. The Medical Informatics Initiative (MII) was founded in 2016 to implement processes that enable cross-clinic access to healthcare data in Germany. Several working groups (WGs) have been set up to coordinate standardized data structures (WG Interoperability), patient information and declarations of consent (WG Consent), and regulations on data exchange (WG Data Sharing). Here we present the most important results of the Data Sharing working group, which include agreed terms of use, legal regulations, and data access processes. They are already being implemented by the established Data Integration Centers (DIZ) and Use and Access Committees (UACs). We describe the services that are necessary to provide researchers with standardized data access. They are implemented with the Research Data Portal for Health, among others. Since the pilot phase, 385 active researchers have used these processes, resulting, as of April 2024, in 19 registered projects and 31 submitted research applications.


Subjects
Electronic Health Records , Information Dissemination , Humans , Biomedical Research , Electronic Health Records/statistics & numerical data , Germany , Health Services Research , Medical Informatics , Medical Record Linkage/methods , Organizational Models
9.
BMC Bioinformatics ; 24(1): 88, 2023 Mar 08.
Article in English | MEDLINE | ID: mdl-36890446

ABSTRACT

BACKGROUND: Personalized oncology represents a shift in cancer treatment from conventional methods to target-specific therapies where the decisions are made based on the patient-specific tumor profile. Selection of the optimal therapy relies on a complex interdisciplinary analysis and interpretation of the identified variants by experts in molecular tumor boards. With up to hundreds of somatic variants identified in a tumor, this process requires visual analytics tools to guide and accelerate the annotation process. RESULTS: The Personal Cancer Network Explorer (PeCaX) is a visual analytics tool supporting the efficient annotation, navigation, and interpretation of somatic genomic variants through functional annotation, drug target annotation, and visual interpretation within the context of biological networks. Starting with somatic variants in a VCF file, PeCaX enables users to explore these variants through a web-based graphical user interface. The most prominent feature of PeCaX is the combination of clinical variant annotation and gene-drug networks with an interactive visualization. This reduces the time and effort the user needs to invest to get to a treatment suggestion and helps to generate new hypotheses. PeCaX is being provided as a platform-independent containerized software package for local or institution-wide deployment. PeCaX is available for download at https://github.com/KohlbacherLab/PeCaX-docker .


Subjects
Neoplasms , Software , Humans , Genomics/methods , Neoplasms/genetics , Medical Oncology
10.
Mol Biol Evol ; 39(6), 2022 06 02.
Article in English | MEDLINE | ID: mdl-35578825

ABSTRACT

Human expansion in the course of the Neolithic transition in western Eurasia has been one of the major topics in ancient DNA research in the last 10 years. Multiple studies have shown that the spread of agriculture and animal husbandry from the Near East across Europe was accompanied by large-scale human expansions. Moreover, changes in subsistence and migration associated with the Neolithic transition have been hypothesized to involve genetic adaptation. Here, we present high-quality genome-wide data from the Linear Pottery Culture site Derenburg-Meerenstieg II (DER) (N = 32 individuals) in Central Germany. Population genetic analyses show that the DER individuals carried predominantly Anatolian Neolithic-like ancestry and a very limited degree of local hunter-gatherer admixture, similar to other early European farmers. Increasing the Linear Pottery culture cohort size to ∼100 individuals allowed us to perform various frequency- and haplotype-based analyses to investigate signatures of selection associated with changes following the adoption of the Neolithic lifestyle. In addition, we developed a new method called Admixture-informed Maximum-likelihood Estimation for Selection Scans that allowed us to test for selection signatures in an admixture-aware fashion. Focusing on the intersection of results from these selection scans, we identified various loci associated with immune function (JAK1, HLA-DQB1) and metabolism (LMF1, LEPR, SORBS1), as well as skin color (SLC24A5, CD82) and folate synthesis (MTHFR, NBPF3). Our findings shed light on the evolutionary pressures, such as infectious disease and changing diet, that were faced by the early farmers of Western Eurasia.


Subjects
Farmers , Human Migration , Agriculture , Ancient DNA , Mitochondrial DNA/genetics , Europe , Population Genetics , Ancient History , Humans
11.
Br J Cancer ; 128(9): 1777-1787, 2023 05.
Article in English | MEDLINE | ID: mdl-36823366

ABSTRACT

BACKGROUND: The immune peptidome of OPSCC has not previously been studied. Cancer-antigen specific vaccination may improve clinical outcome and efficacy of immune checkpoint inhibitors such as PD1/PD-L1 antibodies. METHODS: Mapping of the OPSCC HLA ligandome was performed by mass spectrometry (MS) based analysis of naturally presented HLA ligands isolated from tumour tissue samples (n = 40) using immunoaffinity purification. The cohort included 22 HPV-positive (primarily HPV-16) and 18 HPV-negative samples. A benign reference dataset comprised of the HLA ligandomes of benign haematological and tissue datasets was used to identify tumour-associated antigens. RESULTS: MS analysis led to the identification of naturally HLA-presented peptides in OPSCC tumour tissue. In total, 22,769 peptides from 9485 source proteins were detected on HLA class I. For HLA class II, 15,203 peptides from 4634 source proteins were discovered. By comparative profiling against the benign HLA ligandomic datasets, 29 OPSCC-associated HLA class I ligands covering 11 different HLA allotypes and nine HLA class II ligands were selected to create a peptide warehouse. CONCLUSION: Tumour-associated peptides are HLA-presented on the cell surfaces of OPSCCs. The established warehouse of OPSCC-associated peptides can be used for downstream immunogenicity testing and peptide-based immunotherapy in (semi)personalised strategies.


Subjects
HLA Antigens , Otorhinolaryngologic Neoplasms , Papillomavirus Infections , Squamous Cell Carcinoma of Head and Neck , Humans , Papillomavirus Infections/immunology , Peptides/immunology , Vaccination , Otorhinolaryngologic Neoplasms/immunology , HLA Antigens/immunology , Neoplasm Antigens/immunology , Human Papillomavirus 16 , Human Papillomavirus 18
12.
Bioinformatics ; 38(8): 2202-2210, 2022 04 12.
Article in English | MEDLINE | ID: mdl-35150254

ABSTRACT

MOTIVATION: Diagnosis and treatment decisions based on genomic data have become widespread as the cost of genome sequencing decreases gradually. In this context, disease-gene association studies are of great importance. However, genomic data are very sensitive when compared to other data types and contain information about individuals and their relatives. Many studies have shown that this information can be obtained from the query-response pairs on genomic databases. In this work, we propose a method that uses secure multi-party computation to query genomic databases in a privacy-protected manner. The proposed solution privately outsources genomic data from arbitrarily many sources to two non-colluding proxies and allows genomic databases to be safely stored in semi-honest cloud environments. It provides data privacy, query privacy and output privacy by using XOR-based sharing and, unlike previous solutions, it allows queries to run efficiently on hundreds of thousands of genomic data. RESULTS: We measure the performance of our solution with parameters similar to real-world applications. It is possible to query a genomic database with 3 000 000 variants with five genomic query predicates under 400 ms. Querying 1 048 576 genomes, each containing 1 000 000 variants, for the presence of five different query variants can be achieved approximately in 6 min with a small amount of dedicated hardware and connectivity. These execution times are in the right range to enable real-world applications in medical research and healthcare. Unlike previous studies, it is possible to query multiple databases with response times fast enough for practical application. To the best of our knowledge, this is the first solution that provides this performance for querying large-scale genomic data. AVAILABILITY AND IMPLEMENTATION: https://gitlab.com/DIFUTURE/privacy-preserving-variant-queries. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computer Security , Privacy , Humans , Genomics , Factual Databases
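The XOR-based sharing named in the abstract above splits a value into two shares, one per proxy, so that neither share alone reveals anything about the value. The sketch below shows only this generic building block; it is not the authors' protocol, and the function names and the example variant string are hypothetical.

```python
import secrets


def share(secret: bytes) -> tuple[bytes, bytes]:
    """Split `secret` into two XOR shares; each share alone is uniformly random."""
    mask = secrets.token_bytes(len(secret))
    masked = bytes(a ^ b for a, b in zip(secret, mask))
    return mask, masked


def reconstruct(share_1: bytes, share_2: bytes) -> bytes:
    """Recombine two XOR shares into the original value."""
    return bytes(a ^ b for a, b in zip(share_1, share_2))


# Hypothetical variant encoding; one share would go to each non-colluding proxy.
proxy_1_share, proxy_2_share = share(b"chr1:12345:A>G")
print(reconstruct(proxy_1_share, proxy_2_share))
```

The scheme is only private as long as the two proxies do not pool their shares, which is why the abstract stresses that they must be non-colluding.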
13.
BMC Neurol ; 23(1): 2, 2023 Jan 04.
Article in English | MEDLINE | ID: mdl-36597038

ABSTRACT

BACKGROUND: Although of high individual and socioeconomic relevance, a reliable prediction model for the prognosis of juvenile stroke (18-55 years) is missing. Therefore, the study presented in this protocol aims to prospectively validate the discriminatory power of a prediction score for the 3 months functional outcome after juvenile stroke or transient ischemic attack (TIA) that has been derived from an independent retrospective study using standard clinical workup data. METHODS: PREDICT-Juvenile-Stroke is a multi-centre (n = 4) prospective observational cohort study collecting standard clinical workup data and data on treatment success at 3 months after acute ischemic stroke or TIA that aims to validate a new prediction score for juvenile stroke. The prediction score has been developed upon single center retrospective analysis of 340 juvenile stroke patients. The score determines the patient's individual probability for treatment success defined by a modified Rankin Scale (mRS) 0-2 or return to pre-stroke baseline mRS 3 months after stroke or TIA. This probability will be compared to the observed clinical outcome at 3 months using the area under the receiver operating characteristic curve. The primary endpoint is to validate the clinical potential of the new prediction score for a favourable outcome 3 months after juvenile stroke or TIA. Secondary outcomes are to determine to what extent predictive factors in juvenile stroke or TIA patients differ from those in older patients and to determine the predictive accuracy of the juvenile stroke prediction score on other clinical and paraclinical endpoints. A minimum of 430 juvenile patients (< 55 years) with acute ischemic stroke or TIA, and the same number of older patients will be enrolled for the prospective validation study. 
DISCUSSION: The juvenile stroke prediction score has the potential to enable personalisation of counselling, provision of appropriate information regarding the prognosis and identification of patients who benefit from specific treatments. TRIAL REGISTRATION: The study has been registered at https://drks.de on March 31, 2022 ( DRKS00024407 ).


Subjects
Transient Ischemic Attack , Ischemic Stroke , Stroke , Humans , Young Adult , Aged , Transient Ischemic Attack/diagnosis , Transient Ischemic Attack/epidemiology , Transient Ischemic Attack/complications , Ischemic Stroke/complications , Retrospective Studies , Stroke/diagnosis , Stroke/epidemiology , Stroke/complications , Prognosis , Predictive Value of Tests , Observational Studies as Topic
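The protocol above compares predicted probabilities with observed outcomes using the area under the receiver operating characteristic curve. A minimal sketch of that metric follows, computed as the fraction of (favourable, unfavourable) patient pairs that the score ranks correctly, with ties counted as half. The function name and the example values are hypothetical and do not come from the study.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive case outranks a negative one."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Made-up predicted probabilities; label 1 = favourable outcome (e.g. mRS 0-2).
print(roc_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))
```

An AUC of 0.5 corresponds to a score with no discriminatory power, and 1.0 to perfect separation of the two outcome groups.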
14.
Mol Cell Proteomics ; 20: 100071, 2021.
Article in English | MEDLINE | ID: mdl-33711481

ABSTRACT

Today it is the norm that all relevant proteomics data that support the conclusions in scientific publications are made available in public proteomics data repositories. However, given the increase in the number of clinical proteomics studies, an important emerging topic is the management and dissemination of clinical, and thus potentially sensitive, human proteomics data. Both in the United States and in the European Union, there are legal frameworks protecting the privacy of individuals. Implementing privacy standards for publicly released research data in genomics and transcriptomics has led to processes to control who may access the data, so-called "controlled access" data. In parallel with the technological developments in the field, it is clear that the privacy risks of sharing proteomics data need to be properly assessed and managed. In our view, the proteomics community must be proactive in addressing these issues. Yet a careful balance must be kept. On the one hand, neglecting to address the potential of identifiability in human proteomics data could lead to reputational damage of the field, while on the other hand, erecting barriers to open access to clinical proteomics data will inevitably reduce reuse of proteomics data and could substantially delay critical discoveries in biomedical research. In order to balance these apparently conflicting requirements for data privacy and efficient use and reuse of research efforts through the sharing of clinical proteomics data, development efforts will be needed at different levels including bioinformatics infrastructure, policymaking, and mechanisms of oversight.


Subjects
Data Management , Proteomics , Confidentiality , Humans , Information Dissemination
15.
Proc Natl Acad Sci U S A ; 117(1): 454-463, 2020 01 07.
Article in English | MEDLINE | ID: mdl-31871210

ABSTRACT

Liver fibrosis interferes with normal liver function and facilitates hepatocellular carcinoma (HCC) development, representing a major threat to human health. Here, we present a comprehensive perspective of microRNA (miRNA) function on targeting the fibrotic microenvironment. Starting from a murine HCC model, we identify a miRNA network composed of 8 miRNA hubs and 54 target genes. We show that let-7, miR-30, miR-29c, miR-335, and miR-338 (collectively termed antifibrotic microRNAs [AF-miRNAs]) down-regulate key structural, signaling, and remodeling components of the extracellular matrix. During fibrogenic transition, these miRNAs are transcriptionally regulated by the transcription factor Pparγ and thus we identify a role of Pparγ as regulator of a functionally related class of AF-miRNAs. The miRNA network is active in human HCC, breast, and lung carcinomas, as well as in 2 independent mouse liver fibrosis models. Therefore, we identify a miRNA:mRNA network that contributes to formation of fibrosis in tumorous and nontumorous organs of mice and humans.


Subjects
Hepatocellular Carcinoma/genetics , Neoplastic Gene Expression Regulation , Liver Cirrhosis/pathology , Liver Neoplasms/genetics , MicroRNAs/genetics , PPAR gamma/metabolism , Animals , Breast Neoplasms/genetics , Breast Neoplasms/pathology , Hepatocellular Carcinoma/pathology , CpG Islands/genetics , DNA Methylation , Datasets as Topic , Animal Disease Models , Genetic Epigenesis , Extracellular Matrix/pathology , Female , Hepatic Stellate Cells/pathology , Humans , Liver/cytology , Liver/pathology , Liver Neoplasms/pathology , Lung Neoplasms/genetics , Lung Neoplasms/pathology , Mice , Primary Cell Culture , Genetic Promoter Regions/genetics , RNA-Seq , Tumor Microenvironment/genetics
16.
BMC Bioinformatics ; 23(1): 139, 2022 Apr 19.
Article in English | MEDLINE | ID: mdl-35439941

ABSTRACT

BACKGROUND: With a growing amount of (multi-)omics data being available, the extraction of knowledge from these datasets is still a difficult problem. Classical enrichment-style analyses require predefined pathways or gene sets that are tested for significant deregulation to assess whether the pathway is functionally involved in the biological process under study. De novo identification of these pathways can reduce the bias inherent in predefined pathways or gene sets. At the same time, the definition and efficient identification of these pathways de novo from large biological networks is a challenging problem. RESULTS: We present a novel algorithm, DeRegNet, for the identification of maximally deregulated subnetworks on directed graphs based on deregulation scores derived from (multi-)omics data. DeRegNet can be interpreted as maximum likelihood estimation given a certain probabilistic model for de-novo subgraph identification. We use fractional integer programming to solve the resulting combinatorial optimization problem. We can show that the approach outperforms related algorithms on simulated data with known ground truths. On a publicly available liver cancer dataset we can show that DeRegNet can identify biologically meaningful subgraphs suitable for patient stratification. DeRegNet can also be used to find explicitly multi-omics subgraphs which we demonstrate by presenting subgraphs with consistent methylation-transcription patterns. DeRegNet is freely available as open-source software. CONCLUSION: The proposed algorithmic framework and its available implementation can serve as a valuable heuristic hypothesis generation tool contextualizing omics data within biomolecular networks.


Subjects
Algorithms , Software , Bias , Humans , Statistical Models
17.
J Proteome Res ; 21(4): 1204-1207, 2022 04 01.
Article in English | MEDLINE | ID: mdl-35119864

ABSTRACT

Machine learning is increasingly applied in proteomics and metabolomics to predict molecular structure, function, and physicochemical properties, including behavior in chromatography, ion mobility, and tandem mass spectrometry. These must be described in sufficient detail to apply or evaluate the performance of trained models. Here we look at and interpret the recently published and general DOME (Data, Optimization, Model, Evaluation) recommendations for conducting and reporting on machine learning in the specific context of proteomics and metabolomics.


Subjects
Metabolomics , Proteomics , Machine Learning , Metabolomics/methods , Proteomics/methods , Tandem Mass Spectrometry
18.
Mol Biol Evol ; 38(10): 4059-4076, 2021 09 27.
Article in English | MEDLINE | ID: mdl-34002224

ABSTRACT

Pathogens and associated outbreaks of infectious disease exert selective pressure on human populations, and any changes in allele frequencies that result may be especially evident for genes involved in immunity. In this regard, the 1346-1353 Yersinia pestis-caused Black Death pandemic, with continued plague outbreaks spanning several hundred years, is one of the most devastating recorded in human history. To investigate the potential impact of Y. pestis on human immunity genes, we extracted DNA from 36 plague victims buried in a mass grave in Ellwangen, Germany in the 16th century. We targeted 488 immune-related genes, including HLA, using a novel in-solution hybridization capture approach. In comparison with 50 modern native inhabitants of Ellwangen, we find differences in allele frequencies for variants of the innate immunity proteins Ficolin-2 and NLRP14 at sites involved in determining specificity. We also observed that HLA-DRB1*13 is more than twice as frequent in the modern population, whereas HLA-B alleles encoding an isoleucine at position 80 (I-80+), HLA-C*06:02 and HLA-DPB1 alleles encoding histidine at position 9 are half as frequent in the modern population. Simulations show that natural selection has likely driven these allele frequency changes. Thus, our data suggest that allele frequencies of HLA genes involved in innate and adaptive immunity responsible for extracellular and intracellular responses to pathogenic bacteria, such as Y. pestis, could have been affected by the historical epidemics that occurred in Europe.


Subjects
Plague , Yersinia pestis , DNA , Genomics , Humans , Pandemics/history , Plague/genetics , Yersinia pestis/genetics
20.
Bioinformatics ; 36(21): 5205-5213, 2021 01 29.
Article in English | MEDLINE | ID: mdl-32683440

ABSTRACT

MOTIVATION: The use of genome data for diagnosis and treatment is becoming increasingly common. Researchers need access to as many genomes as possible to interpret the patient genome, to obtain some statistical patterns and to reveal disease-gene relationships. The sensitive information contained in the genome data and the high risk of re-identification increase the privacy and security concerns associated with sharing such data. In this article, we present an approach to identify disease-associated variants and genes while ensuring patient privacy. The proposed method uses secure multi-party computation to find disease-causing mutations under specific inheritance models without sacrificing the privacy of individuals. It discloses only variants or genes obtained as a result of the analysis. Thus, the vast majority of patient data can be kept private. RESULTS: Our prototype implementation performs analyses on thousands of genomic data in milliseconds, and the runtime scales logarithmically with the number of patients. We present the first inheritance model (recessive, dominant and compound heterozygous) based privacy-preserving analyses of genomic data to find disease-causing mutations. Furthermore, we re-implement the privacy-preserving methods (MAX, SETDIFF and INTERSECTION) proposed in a previous study. Our MAX, SETDIFF and INTERSECTION implementations are 2.5, 1122 and 341 times faster than the corresponding operations of the state-of-the-art protocol, respectively. AVAILABILITY AND IMPLEMENTATION: https://gitlab.com/DIFUTURE/privacy-preserving-genomic-diagnosis. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genomics , Privacy , Confidentiality , Genome-Wide Association Study , Humans , Mutation