ABSTRACT
BACKGROUND: The phenotype of a living cell is determined by its pattern of active signaling networks, giving rise to a "molecular phenotype" associated with differential gene expression. Digital amplicon-based RNA quantification by sequencing is a useful technology for molecular phenotyping, a novel tool for characterizing the state of biological systems. RESULTS: We show here that the activity of signaling networks can be assessed from a set of established key regulators and expression targets rather than the entire transcriptome. We compiled a panel of 917 human pathway reporter genes, representing 154 human signaling and metabolic networks, for integrated knowledge- and data-driven understanding of biological processes. The reporter genes are significantly enriched for regulators and effectors covering a wide range of biological processes, and faithfully capture gene-level and pathway-level changes. We apply the approach to iPSC-derived cardiomyocytes and primary human hepatocytes to describe changes in molecular phenotype during development or drug response. The reporter genes deliver an accurate pathway-centric view of the biological system under study and identify known and novel modulation of signaling networks, consistent with literature or experimental data. CONCLUSIONS: A panel of 917 pathway reporter genes is sufficient to describe changes in the molecular phenotype defined by 154 signaling cascades in various human cell types. AmpliSeq-RNA-based digital transcript imaging enables simultaneous monitoring of the entire pathway reporter gene panel in up to 150 samples. We propose molecular phenotyping as a useful approach to understanding diseases and drug action at the network level.
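The abstract does not say how gene-level changes are rolled up to the pathway level. As a minimal sketch, assuming the simplest possible summary (the mean log2 fold change of each pathway's reporter genes), the idea could look like the following. The gene names, pathway names, and the averaging rule are hypothetical placeholders, not the authors' method.

```python
from statistics import mean


def pathway_scores(log2_fold_changes, pathway_reporters):
    """Summarize each pathway as the mean log2 fold change of its reporter genes.

    Reporter genes without a measurement are skipped; pathways with no
    measured reporter gene are left out of the result.
    """
    scores = {}
    for pathway, genes in pathway_reporters.items():
        measured = [log2_fold_changes[g] for g in genes if g in log2_fold_changes]
        if measured:
            scores[pathway] = mean(measured)
    return scores


# Hypothetical gene-level changes (treated vs. control), for illustration only.
log2_fold_changes = {"G1": 2.0, "G2": 1.0, "G3": -1.5, "G4": -0.5}

# Hypothetical reporter gene sets; G5 was not measured and is ignored.
pathway_reporters = {
    "pathway_X": ["G1", "G2"],
    "pathway_Y": ["G3", "G4", "G5"],
}

scores = pathway_scores(log2_fold_changes, pathway_reporters)
print(scores)
```

A mean is easy to interpret but sensitive to single outlier genes; a median or a rank-based enrichment statistic would be a reasonable alternative under the same interface.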
Subject(s)
Algorithms , Reporter Genes/genetics , Metabolic Networks and Pathways/genetics , Signal Transduction/genetics , Non-Steroidal Anti-Inflammatory Agents/toxicity , Cell Differentiation , Diclofenac/toxicity , Hepatocytes/cytology , Hepatocytes/drug effects , Hepatocytes/metabolism , Humans , Induced Pluripotent Stem Cells/cytology , Induced Pluripotent Stem Cells/metabolism , Cardiac Myocytes/cytology , Cardiac Myocytes/metabolism , Phenotype , Principal Component Analysis
ABSTRACT
BACKGROUND: In the past decade the Göttingen minipig has gained increasing recognition as an animal model in pharmaceutical and safety research because it recapitulates many aspects of human physiology and metabolism. Genome-based comparison of drug targets together with quantitative tissue expression analysis allows rational prediction of the pharmacology and cross-reactivity of human drugs in animal models, thereby reducing drug attrition, an important challenge in drug development. RESULTS: Here we present a new chromosome-level version of the Göttingen minipig genome together with a comparative transcriptional analysis of pharmaceutically relevant tissues as a basis for translational research. We mapped and assembled whole-genome shotgun (WGS) sequencing reads against the reference genome of the Duroc pig and predict 19,228 protein-coding genes orthologous to human genes. Genome-based prediction of the sequence of human drug targets enables the prediction of drug cross-reactivity based on conservation of binding sites. We further support the finding that the genome of Sus scrofa contains about ten times fewer pseudogenized genes than other vertebrates. Among the functional human orthologs of these minipig pseudogenes we found HEPN1, a putative tumor suppressor gene. The genomes of Sus scrofa, the Tibetan boar, the African bushpig, and the warthog show sequence conservation of all inactivating HEPN1 mutations, suggesting disruption before the evolutionary split of these pig species. We identify 133 Sus scrofa-specific, conserved long non-coding RNAs (lncRNAs) in the minipig genome and show that these transcripts are highly conserved in the African pigs and the Tibetan boar, suggesting functional significance. Using a new minipig-specific microarray we show high conservation of gene expression signatures between humans and adult minipigs in 13 tissues with biomedical relevance. We underline this relationship for minipig and human liver, where we could demonstrate similar expression levels for most phase I drug-metabolizing enzymes. Higher expression levels and metabolic activities were found in minipig than in human for FMO1, AKR/CRs, and phase II drug-metabolizing enzymes. Gene expression in equivalent tissues is considerably more variable in minipig than in human organs, which is important for study design when a human target belongs to this variable category in the minipig. The first analysis of gene expression in multiple tissues during development from young to adult shows that the majority of transcriptional programs are concluded four weeks after birth. This finding is in line with the advanced state of human postnatal organ development at comparable age categories and further supports the minipig as a model for pediatric drug safety studies. CONCLUSIONS: Genome-based assessment of sequence conservation combined with gene expression data in several tissues improves the translational value of the minipig for human drug development. The genome and gene expression data presented here are important resources for researchers using the minipig as a model for biomedical research or commercial breeding. The potential impact of our data on comparative genomics, translational research, and experimental medicine is discussed.
Subject(s)
Genome , Miniature Swine/genetics , Aging/genetics , Animals , Chromosomes , Gene Expression , Gene Expression Profiling , Humans , Liver/metabolism , Pharmaceutical Preparations/metabolism , Pseudogenes , Species Specificity , Swine , Genetic Transcription
ABSTRACT
BACKGROUND: In clinical and basic research, custom panels for transcript profiling are gaining importance because only project-specific informative genes are interrogated. This approach reduces costs and complexity of data analysis and allows multiplexing of samples. Polymerase chain reaction (PCR)-based TaqMan assays have high sensitivity but suffer from a limited dynamic range and sample throughput. Hence, there is a gap for a technology able to measure expression of large gene sets in multiple samples. RESULTS: We have adapted a commercially available mRNA quantification assay (AmpliSeq-RNA) that measures mRNA abundance based on the frequency of PCR amplicons determined by high-throughput semiconductor sequencing. This approach allows parallel, accurate quantification of about 1000 transcripts in multiple samples, covering a dynamic range of five orders of magnitude. Using samples derived from a well-characterized stem cell differentiation model, we obtained a good correlation (r = 0.78) between transcript levels measured by AmpliSeq-RNA and by DNA microarrays. A significant portion of low-abundance transcripts escapes detection by microarrays due to limited sensitivity. Standard quantitative RNA sequencing of the same samples confirms expression of low-abundance genes, with an overall correlation coefficient of r = 0.87. Based on digital AmpliSeq-RNA imaging we show switches of signaling cascades at four time points during differentiation of stem cells into cardiomyocytes. CONCLUSIONS: The AmpliSeq-RNA technology adapted to high-throughput semiconductor sequencing allows robust transcript quantification based on amplicon frequency. Multiplexing of at least 900 parallel PCR reactions is feasible because sequencing-based quantification eliminates artefacts arising from off-target amplification. Using this approach, RNA quantification and detection of genetic variations can be performed in the same experiment.
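Quantification "based on the frequency of PCR amplicons" can be illustrated with a small sketch: each amplicon's read count is divided by the total reads in the sample and scaled to reads per million (RPM), with an optional log2 transform to handle the wide dynamic range. The normalization scheme, the pseudocount, and the gene names below are assumptions made for illustration; the abstract does not specify the actual processing pipeline.

```python
import math


def amplicon_rpm(counts):
    """Convert raw amplicon read counts to reads per million (RPM)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads assigned to any amplicon")
    return {gene: 1e6 * n / total for gene, n in counts.items()}


def log2_rpm(counts, pseudocount=1.0):
    """Log2-transform RPM values; the pseudocount keeps zero counts finite."""
    return {gene: math.log2(v + pseudocount) for gene, v in amplicon_rpm(counts).items()}


# Hypothetical read counts for three amplicons in one sample (500,000 reads).
counts = {"GENE_A": 5, "GENE_B": 5000, "GENE_C": 494995}

rpm = amplicon_rpm(counts)
log_rpm = log2_rpm(counts)
print(rpm)
print(log_rpm)
```

Because every sample is scaled to the same total, RPM values are comparable across multiplexed samples sequenced to different depths, provided the panel composition is the same.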
Subject(s)
Messenger RNA/genetics , RNA Sequence Analysis , Contig Mapping , Gene Expression Profiling , Humans , Oligonucleotide Array Sequence Analysis , Messenger RNA/metabolism , Real-Time Polymerase Chain Reaction , Sensitivity and Specificity , Transcriptome
ABSTRACT
The long-tailed macaque, also referred to as cynomolgus monkey (Macaca fascicularis), is one of the most important nonhuman primate animal models in basic and applied biomedical research. To improve the predictive power of primate experiments for humans, we determined the genome sequence of a Macaca fascicularis female of Mauritian origin using a whole-genome shotgun sequencing approach. We applied a template switch strategy that uses either the rhesus or the human genome to assemble sequence reads. The sixfold sequence coverage of the draft genome sequence enabled discovery of about 2.1 million potential single-nucleotide polymorphisms based on occurrence of a dimorphic nucleotide at a given position in the genome sequence. Homology-based annotation allowed us to identify 17,387 orthologs of human protein-coding genes in the M. fascicularis draft genome, and the predicted transcripts enabled the design of a M. fascicularis-specific gene expression microarray. Using liver samples from 36 individuals of different geographic origin we identified 718 genes with highly variable expression in liver, whereas the majority of the transcriptome shows relatively stable and comparable expression. Knowledge of the M. fascicularis draft genome is an important contribution to both the use of this animal in disease models and the safety assessment of drugs and their metabolites. In particular, this information allows high-resolution genotyping and microarray-based gene-expression profiling for animal stratification, thereby allowing the use of well-characterized animals for safety testing. Finally, the genome sequence presented here is a significant contribution to the global "3R" animal welfare initiative, which has the goal to reduce, refine, and replace animal experiments.
Subject(s)
Preclinical Drug Evaluation , Macaca fascicularis/genetics , Animal Models , Animals , Cytochrome P-450 Enzyme System/genetics , Cytokines/genetics , DNA/genetics , DNA/isolation & purification , Female , Gene Expression Profiling/methods , Genome , High-Throughput Nucleotide Sequencing , Humans , Liver/metabolism , Oligonucleotide Array Sequence Analysis/methods , Organic Anion Transporters/genetics , Phylogeny , Single Nucleotide Polymorphism , DNA Sequence Analysis , Nucleic Acid Sequence Homology , Genetic Transcription
ABSTRACT
BACKGROUND: Whole-transcriptome analyses are an essential tool for understanding disease mechanisms. Approaches based on next-generation sequencing provide fast and affordable data but rely on the availability of annotated genomes. However, many areas of biomedical research require non-standard animal models for which genome information is not available. These include the Syrian hamster Mesocricetus auratus, an important model for dyslipidaemia because it mirrors many aspects of human disease and pharmacological responses. We show that complementary use of two independent next-generation sequencing technologies combined with mapping to multiple genome databases allows unambiguous transcript annotation and quantitative transcript imaging. We refer to this approach as "triple match sequencing" (TMS). RESULTS: Contigs assembled from a normalized Roche 454 hamster liver library comprising 1.2 million long reads were used to identify 10,800 unique transcripts based on homology to RefSeq database entries from human, mouse, and rat. For mRNA quantification we mapped 82 million SAGE tags (SOLiD) from the same RNA source to the annotated hamster liver transcriptome contigs. We compared the liver transcriptome of hamster with equivalent data from human, rat, minipig, and cynomolgus monkey to highlight differential gene expression, with a focus on lipid metabolism. We identify a cluster of five genes functionally related to HDL metabolism that is expressed in human, cynomolgus monkey, minipig, and hamster but lacking in rat, a non-responder species for lipid-lowering drugs. CONCLUSIONS: The TMS approach is suited to fast and inexpensive transcript profiling in cells or tissues of species for which a fully annotated genome is not available. The continuously growing number of well-annotated reference genomes will further empower reliable transcript identification and thereby raise the utility of the method for any species of interest.
Subject(s)
Lipid Metabolism/genetics , Liver/metabolism , Mesocricetus/genetics , Animals , Cricetinae , Genetic Databases , Gene Expression Profiling/methods , Humans , Macaca fascicularis/genetics , Male , Oligonucleotide Array Sequence Analysis/methods , Messenger RNA/genetics , Rats , Sus scrofa/genetics
ABSTRACT
BACKGROUND: Theileria parva is a tick-borne protozoan parasite that causes East Coast fever, a disease of cattle in sub-Saharan Africa. Like Plasmodium falciparum, the parasite undergoes a transient diploid life-cycle stage in the gut of the arthropod vector, which involves an obligate sexual cycle. As assessed using low-resolution VNTR markers, the crossover (CO) rate in T. parva is relatively high and has been reported to vary across different regions of the genome; non-crossovers (NCOs) and CO-associated gene conversions have not yet been characterised due to the lack of informative markers. To examine all recombination events at high marker resolution, we sequenced the haploid genomes of two parental strains and of two recombinant clones derived from ticks fed on cattle that had been simultaneously co-infected with two different parasite isolates. RESULTS: By comparing the genome sequences, we were able to genotype over 64,000 SNP markers with an average spacing of 127 bp in the two progeny clones. Previously unrecognized COs in sub-telomeric regions were detected. About 50% of CO breakpoints were accompanied by gene conversion events. Such a high fraction of COs accompanied by gene conversions demonstrates the contribution of meiotic recombination to the diversity and evolutionary success of T. parva, as the process not only redistributed existing genetic variation but also altered allelic frequencies. Compared to COs, NCOs were observed more frequently and were more uniformly distributed across the genome. In both progeny clones, genomic regions with more SNP markers had a reduced frequency of COs or NCOs, suggesting that the sequence divergence between the parental strains was high enough to adversely affect recombination frequencies. Intra-species polymorphism analysis identified 81 loci as likely to be under selection in the sequenced genomes. CONCLUSIONS: Using whole-genome sequencing of two recombinant clones and their parents, we generated maps of COs, NCOs, and CO-associated gene conversion events for T. parva. The data comprise one of the highest-resolution genome-wide analyses of the multiple outcomes of meiotic recombination for this pathogen. The study also demonstrates the usefulness of high-throughput sequencing-based typing for detailed analysis of recombination in organisms in which conventional genetic analysis is technically difficult.
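To make the distinction between COs and NCOs concrete, here is a minimal sketch of how recombination events might be read off a haploid progeny genotype once each SNP marker has been assigned to one of the two parents. It collapses consecutive markers of the same parental origin into tracts, labels short interior tracts as NCO-like conversion tracts, and counts the remaining switches of parental origin as crossovers. The 2,000 bp length threshold, the marker positions, and the genotype string are hypothetical; the abstract does not describe the authors' calling procedure, and this sketch does not attempt to detect CO-associated conversions.

```python
from itertools import groupby


def parental_tracts(positions, alleles):
    """Collapse consecutive markers with the same parental origin into tracts.

    Returns a list of (parent, first_position, last_position, n_markers).
    """
    tracts = []
    idx = 0
    for parent, run in groupby(alleles):
        n = len(list(run))
        tracts.append((parent, positions[idx], positions[idx + n - 1], n))
        idx += n
    return tracts


def classify_events(tracts, max_conversion_bp=2000):
    """Count crossovers and list NCO-like tracts.

    With two parents, an interior tract is always flanked by the other parent
    on both sides, so a short interior tract is treated as NCO-like. The
    threshold is an assumed value, not one taken from the study.
    """
    nco = []
    backbone = []
    for i, (parent, start, end, _n) in enumerate(tracts):
        interior = 0 < i < len(tracts) - 1
        if interior and (end - start) <= max_conversion_bp:
            nco.append((start, end))
        else:
            backbone.append(parent)
    crossovers = sum(1 for a, b in zip(backbone, backbone[1:]) if a != b)
    return crossovers, nco


# Hypothetical markers on one chromosome of a progeny clone ("A"/"B" = parent).
positions = [100, 230, 350, 480, 600, 5000, 5120, 9000, 9130, 9260]
alleles = "AAABAABBBB"

tracts = parental_tracts(positions, alleles)
crossovers, nco = classify_events(tracts)
print(tracts)
print(crossovers, nco)
```

In practice a tract supported by a single marker would also need to be checked against sequencing or genotyping error before being called a conversion.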
Subject(s)
Cattle Diseases/parasitology , Protozoan DNA/genetics , Theileria parva/genetics , Ticks/parasitology , Animals , Arthropod Vectors/parasitology , Base Sequence , Cattle , Chromosome Mapping , Genetic Crossing Over , Gene Conversion , Gene Frequency , Genetic Variation , Genotype , Genotyping Techniques , High-Throughput Nucleotide Sequencing , Single Nucleotide Polymorphism , Genetic Recombination , DNA Sequence Analysis , Theileria parva/isolation & purification , Theileriasis/genetics , Theileriasis/parasitology
ABSTRACT
Current animal-free methods to assess the teratogenicity of drugs under development still deliver high numbers of false negatives. To improve the sensitivity of human teratogenicity prediction, we characterized the TeraTox test, a newly developed multilineage differentiation assay using 3D human induced pluripotent stem cells. As primary output, TeraTox produces concentration-dependent cytotoxicity and altered gene expression data for each test compound. These data are fed into an interpretable machine-learning model that predicts the concentration-dependent human teratogenicity potential of drug candidates. We applied TeraTox to profile 33 approved pharmaceuticals and 12 proprietary drug candidates with known in vivo data. Comparing TeraTox predictions with known human or animal toxicity, we report an accuracy of 69% (specificity: 53%, sensitivity: 79%). TeraTox performed better than 2 quantitative structure-activity relationship models and had a higher sensitivity than the murine embryonic stem cell test (accuracy: 58%, specificity: 76%, and sensitivity: 46%) run in the same laboratory. The overall prediction accuracy could be further improved by combining TeraTox and mouse embryonic stem cell test results. Furthermore, patterns of altered gene expression revealed by TeraTox may help to group toxicologically similar compounds and possibly to deduce common modes of action. The TeraTox assay and the dataset described here therefore represent a new tool and a valuable resource for drug teratogenicity assessment.
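The reported accuracy, specificity, and sensitivity are all derived from a confusion matrix, and the sketch below shows the arithmetic. The counts are hypothetical: the abstract gives only percentages and the total of 45 compounds, so the split into 28 teratogens and 17 non-teratogens is an assumption chosen solely because it rounds to the reported TeraTox figures. It should not be read as the study's actual data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # fraction of true teratogens flagged
        "specificity": tn / (tn + fp),  # fraction of non-teratogens cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts for 45 compounds (assumed, not taken from the study).
metrics = classification_metrics(tp=22, fn=6, tn=9, fp=8)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

The example also shows why a test with more false positives can still be attractive for safety screening: sensitivity depends only on how many true teratogens are missed, which is the error the abstract is most concerned with.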
Subject(s)
Induced Pluripotent Stem Cells , Teratogenesis , Animals , Biological Assay/methods , Cell Differentiation , Embryonic Stem Cells/metabolism , Mice
ABSTRACT
The safety of most human recombinant proteins can be evaluated in transgenic mice tolerant to specific human proteins. However, owing to insufficient genetic diversity and to fundamental differences in immune mechanisms, small-animal models of human diseases are often unsuitable for immunogenicity testing and for predicting adverse outcomes in human patients. Most human therapeutic antibodies trigger xenogeneic responses in wild-type animals and thus rapid clearance of the drugs, which makes in vivo toxicological testing of human antibodies challenging. Here we report the generation of Göttingen minipigs carrying a mini-repertoire of human genes for the immunoglobulin heavy chains γ1 and γ4 and the immunoglobulin light chain κ. In line with observations in human patients, the genetically modified minipigs tolerated the clinically non-immunogenic IgG1κ-isotype monoclonal antibodies daratumumab and bevacizumab, and elicited antibodies against the checkpoint inhibitor atezolizumab and the engineered interleukin cergutuzumab amunaleukin. The humanized minipigs can facilitate the safety and efficacy testing of therapeutic antibodies.
Subject(s)
Immunoglobulin Heavy Chains , Mice , Humans , Animals , Swine , Miniature Swine , Immunoglobulin Heavy Chains/genetics , Recombinant Proteins , Transgenic Mice
ABSTRACT
Today, novel therapeutics are identified in an environment which is intrinsically different from the clinical context in which they are ultimately evaluated. Using molecular phenotyping and an in vitro model of diabetic cardiomyopathy, we show that by quantifying pathway reporter gene expression, molecular phenotyping can cluster compounds based on pathway profiles and dissect associations between pathway activities and disease phenotypes simultaneously. Molecular phenotyping was applicable to compounds with a range of binding specificities and triaged false positives derived from high-content screening assays. The technique identified a class of calcium-signaling modulators that can reverse disease-regulated pathways and phenotypes, which was validated by structurally distinct compounds of relevant classes. Our results advocate for application of molecular phenotyping in early drug discovery, promoting biological relevance as a key selection criterion early in the drug development cascade.
Subject(s)
Computational Biology/methods , Drug Discovery/methods , Phenotype , Data Mining , Preclinical Drug Evaluation , Humans
ABSTRACT
In the pharmaceutical industry, knowledge of the three-dimensional structure of a specific target facilitates the drug-discovery process. Despite possessing favourable analytical properties such as high purity and monodispersity in light scattering, some proteins are not capable of forming crystals suitable for X-ray analysis. Cyclophilin D, an isoform of cyclophilin that is expressed in the mitochondria, was selected as a drug target for the treatment of cardiac disorders. As the wild-type enzyme defied all attempts at crystallization, protein engineering on the enzyme surface was performed. The K133I mutant gave crystals that diffracted to 1.7 Å resolution using in-house X-ray facilities and were suitable for soaking experiments. The crystals were very robust and diffraction was maintained after soaking in 25% DMSO solution: excellent conditions for the rapid analysis of complex structures, including crystallographic fragment screening.
Subject(s)
Cyclophilins/chemistry , Crystallization , X-Ray Crystallography , Peptidyl-Prolyl Isomerase F , Cyclophilins/genetics , Cyclophilins/isolation & purification , DNA/genetics , DNA/isolation & purification , Dimethyl Sulfoxide , Light , Site-Directed Mutagenesis , Peptidylprolyl Isomerase/chemistry , Plasmids/genetics , Point Mutation/genetics , Protein Engineering , Reverse Transcriptase Polymerase Chain Reaction , Radiation Scattering , Solvents , Fluorescence Spectrometry , Ultracentrifugation
ABSTRACT
The full-length and ectodomain forms of beta-site APP cleavage enzyme (BACE) have been cloned, expressed in Sf9 cells, and purified to homogeneity. This aspartic protease cleaves the amyloid precursor protein at the beta-secretase site, a critical step in the pathogenesis of Alzheimer's disease. Comparison of BACE with other aspartic proteases such as cathepsin D and E, napsin A, pepsin, and renin revealed little similarity with respect to substrate preference and inhibitor profile. On the other hand, these parameters are all very similar for the homologous enzyme BACE2. Based on a collection of decameric substrates, it was found that BACE has a loose substrate specificity and that the substrate recognition site in BACE extends over several amino acids. In common with the aspartic proteases mentioned above, BACE prefers a leucine residue at position P1. Unlike cathepsin D and the other proteases named above, BACE accepts polar or acidic residues at positions P2' and P1 but prefers bulky hydrophobic residues at position P3. BACE displays poor kinetic constants toward its known substrates (wild-type substrate, SEVKM/DAEFR, Km = 7 µM, kcat = 0.002 s⁻¹; Swedish mutant, SEVNL/DAEFR, Km = 9 µM, kcat = 0.02 s⁻¹). A new substrate (VVEVDA/AVTP, Km = 1 µM, kcat = 0.004) was identified by serendipity.
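The kinetic constants quoted above can be compared through the catalytic efficiency kcat/Km, the standard figure of merit for an enzyme acting on a substrate. The short sketch below computes it for the two substrates whose constants are given with full units; the function name and dictionary layout are illustrative choices, and the third substrate is left out because the abstract gives its kcat without a unit.

```python
def catalytic_efficiency(kcat_per_s, km_micromolar):
    """Return kcat/Km in M^-1 s^-1, given kcat in s^-1 and Km in micromolar."""
    return kcat_per_s / (km_micromolar * 1e-6)


# (kcat in s^-1, Km in micromolar) as quoted in the abstract.
substrates = {
    "wild-type SEVKM/DAEFR": (0.002, 7.0),
    "Swedish SEVNL/DAEFR": (0.02, 9.0),
}

efficiency = {
    name: catalytic_efficiency(kcat, km) for name, (kcat, km) in substrates.items()
}
ratio = efficiency["Swedish SEVNL/DAEFR"] / efficiency["wild-type SEVKM/DAEFR"]

for name, value in efficiency.items():
    print(f"{name}: {value:.0f} M^-1 s^-1")
print(f"Swedish / wild-type: {ratio:.1f}-fold")
```

On these numbers the Swedish mutant sequence is cleaved roughly eight times more efficiently than the wild-type sequence, almost entirely because of the tenfold higher kcat.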