Results 1 - 20 of 22
1.
Bioinformatics ; 39(8)2023 08 01.
Article in English | MEDLINE | ID: mdl-37527006

ABSTRACT

MOTIVATION: Read alignment is an essential first step in the characterization of DNA sequence variation. The accuracy of variant-calling results depends not only on the quality of read alignment and variant-calling software but also on the interaction between these complex software tools. RESULTS: In this review, we evaluate short-read aligner performance with the goal of optimizing germline variant-calling accuracy. We examine the performance of three general-purpose short-read aligners (BWA-MEM, Bowtie 2, and Arioc) in conjunction with three germline variant callers: DeepVariant, FreeBayes, and GATK HaplotypeCaller. We discuss the behavior of the read aligners with regard to the data elements on which the variant callers rely, and illustrate how the runtime configurations of these software tools combine to affect variant-calling performance.


Subjects
High-Throughput Nucleotide Sequencing; Software; High-Throughput Nucleotide Sequencing/methods; Germ Cells; Sequence Analysis, DNA/methods
2.
Cancers (Basel) ; 15(12)2023 Jun 08.
Article in English | MEDLINE | ID: mdl-37370719

ABSTRACT

Multispectral, multiplex immunofluorescence (mIF) microscopy has been used to great effect in research to identify cellular co-expression profiles and spatial relationships within tissue, providing a myriad of diagnostic advantages. As these technologies mature, it is essential that image data from mIF microscopes is reproducible and standardizable across devices. We sought to characterize and correct differences in illumination intensity and spectral sensitivity between three multispectral microscopes. We scanned eight melanoma tissue samples twice on each microscope and calculated their average tissue region flux intensities. We found a baseline average standard deviation of 29.9% across all microscopes, scans, and samples, which was reduced to 13.9% after applying sample-specific corrections accounting for differences in the tissue shown on each slide. We used a basic calibration model to correct sample- and microscope-specific effects on overall brightness and relative brightness as a function of the image layer. We tested the generalizability of the calibration procedure and found that applying corrections to independent validation subsets of the samples reduced the variation to 2.9 ± 0.03%. Variations in the unmixed marker expressions were reduced from 15.8% to 4.4% by correcting the raw images to a single reference microscope. Our findings show that mIF microscopes can be standardized for use in clinical pathology laboratories using a relatively simple correction model.

3.
Lab Invest ; 103(8): 100175, 2023 08.
Article in English | MEDLINE | ID: mdl-37196983

ABSTRACT

Multiplex immunohistochemistry/immunofluorescence (mIHC/mIF) is a developing technology that facilitates the evaluation of multiple, simultaneous protein expressions at single-cell resolution while preserving tissue architecture. These approaches have shown great potential for biomarker discovery, yet many challenges remain. Importantly, streamlined cross-registration of multiplex immunofluorescence images with additional imaging modalities and immunohistochemistry (IHC) can help increase the plex and/or improve the quality of the data generated by potentiating downstream processes such as cell segmentation. To address this problem, a fully automated process was designed to perform a hierarchical, parallelizable, and deformable registration of multiplexed digital whole-slide images (WSIs). We generalized the calculation of mutual information as a registration criterion to an arbitrary number of dimensions, making it well suited for multiplexed imaging. We also used the self-information of a given IF channel as a criterion to select the optimal channels to use for registration. Additionally, as precise labeling of cellular membranes in situ is essential for robust cell segmentation, a pan-membrane immunohistochemical staining method was developed for incorporation into mIF panels or for use as an IHC followed by cross-registration. In this study, we demonstrate this process by registering whole-slide 6-plex/7-color mIF images with whole-slide brightfield mIHC images, including a CD3 and a pan-membrane stain. Our algorithm, WSI mutual information registration (WSIMIR), performed highly accurate registration, allowing the retrospective generation of an 8-plex/9-color WSI, and outperformed 2 alternative automated methods for cross-registration by Jaccard index and Dice similarity coefficient (WSIMIR vs automated WARPY, P < .01 and P < .01, respectively; vs HALO + transformix, P = .083 and P = .049, respectively). Furthermore, the addition of a pan-membrane IHC stain cross-registered to an mIF panel facilitated improved automated cell segmentation across mIF WSIs, as measured by significantly increased correct detections, Jaccard index (0.78 vs 0.65), and Dice similarity coefficient (0.88 vs 0.79).
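
The two overlap scores named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names and the toy pixel sets are hypothetical, and each segmented region is assumed to be represented as a set of (row, column) pixel coordinates.

```python
from typing import Set, Tuple

Pixel = Tuple[int, int]  # hypothetical representation: (row, column)


def jaccard_index(a: Set[Pixel], b: Set[Pixel]) -> float:
    """Intersection over union of two pixel sets."""
    union = a | b
    if not union:
        # Convention chosen here: two empty regions count as identical.
        return 1.0
    return len(a & b) / len(union)


def dice_coefficient(a: Set[Pixel], b: Set[Pixel]) -> float:
    """Twice the intersection divided by the sum of the two set sizes."""
    total = len(a) + len(b)
    if total == 0:
        return 1.0
    return 2 * len(a & b) / total


# Toy example: a 2x2 reference region and a partly overlapping detection.
reference = {(0, 0), (0, 1), (1, 0), (1, 1)}
detected = {(0, 1), (1, 1), (1, 2)}

j = jaccard_index(reference, detected)     # 2 shared pixels / 5 in the union
d = dice_coefficient(reference, detected)  # 2 * 2 shared / (4 + 3)
print(f"Jaccard = {j:.3f}, Dice = {d:.3f}")
```

For a single pair of regions the two scores are linked by Dice = 2J / (1 + J). Applying that identity to the reported Jaccard values of 0.78 and 0.65 gives roughly 0.88 and 0.79, in line with the reported Dice coefficients, although the identity need not hold exactly for scores averaged over many cells.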


Subjects
Coloring Agents; Diagnostic Imaging; Immunohistochemistry; Retrospective Studies; Fluorescent Antibody Technique; Cell Membrane
4.
Front Artif Intell ; 6: 1116870, 2023.
Article in English | MEDLINE | ID: mdl-36925616

ABSTRACT

The brain is arguably the most powerful computation system known. It is extremely efficient in processing large amounts of information and can discern signals from noise, adapt, and filter faulty information all while running on only 20 watts of power. The human brain's processing efficiency, progressive learning, and plasticity are unmatched by any computer system. Recent advances in stem cell technology have elevated the field of cell culture to higher levels of complexity, such as the development of three-dimensional (3D) brain organoids that recapitulate human brain functionality better than traditional monolayer cell systems. Organoid Intelligence (OI) aims to harness the innate biological capabilities of brain organoids for biocomputing and synthetic intelligence by interfacing them with computer technology. With the latest strides in stem cell technology, bioengineering, and machine learning, we can explore the ability of brain organoids to compute, and store given information (input), execute a task (output), and study how this affects the structural and functional connections in the organoids themselves. Furthermore, understanding how learning generates and changes patterns of connectivity in organoids can shed light on the early stages of cognition in the human brain. Investigating and understanding these concepts is an enormous, multidisciplinary endeavor that necessitates the engagement of both the scientific community and the public. Thus, on Feb 22-24 of 2022, the Johns Hopkins University held the first Organoid Intelligence Workshop to form an OI Community and to lay out the groundwork for the establishment of OI as a new scientific discipline. The potential of OI to revolutionize computing, neurological research, and drug development was discussed, along with a vision and roadmap for its development over the coming decade.

5.
Clin Cancer Res ; 28(16): 3417-3424, 2022 08 15.
Article in English | MEDLINE | ID: mdl-35522154

ABSTRACT

Astronomy was among the first disciplines to embrace Big Data and use it to characterize spatial relationships between stars and galaxies. Today, medicine, in particular pathology, has similar needs with regard to characterizing the spatial relationships between cells, with an emphasis on understanding the organization of the tumor microenvironment. In this article, we chronicle the emergence of data-intensive science through the development of the Sloan Digital Sky Survey and describe how analysis patterns and approaches similarly apply to multiplex immunofluorescence (mIF) pathology image exploration. The lessons learned from astronomy are detailed, and the new AstroPath platform that capitalizes on these learnings is described. AstroPath is being used to generate and display tumor-immune maps that can be used for mIF immuno-oncology biomarker development. The development of AstroPath as an open resource for visualizing and analyzing large-scale spatially resolved mIF datasets is underway, akin to how publicly available maps of the sky have been used by astronomers and citizen scientists alike. Associated technical, academic, and funding considerations, as well as extended future development for inclusion of spatial transcriptomics and application of artificial intelligence, are also addressed.


Subjects
Astronomy; Neoplasms; Artificial Intelligence; Astronomy/methods; Fluorescent Antibody Technique; Humans; Neoplasms/diagnosis; Neoplasms/genetics; Tumor Microenvironment
6.
ACS Omega ; 7(16): 13398-13402, 2022 Apr 26.
Article in English | MEDLINE | ID: mdl-35505822

ABSTRACT

Research organizations are critically in need of directed growth toward future interoperability and federation. The purpose of this Viewpoint is to alert the government, academia, professional societies, foundations, and industries of a further need for consideration of data in chemistry and materials as a long-term and sustained development in the US. This paper is a call for coordinated action from the government, academia, and industry to establish a national strategy and concomitant infrastructure focused on research data.

7.
Mol Psychiatry ; 27(4): 2061-2067, 2022 04.
Article in English | MEDLINE | ID: mdl-35236959

ABSTRACT

Antipsychotic drugs are the current first line of treatment for schizophrenia and other psychotic conditions. However, their molecular effects on the human brain are poorly studied, due to difficulty of tissue access and confounders associated with disease status. Here we examine differences in gene expression and DNA methylation associated with positive antipsychotic drug toxicology status in the human caudate nucleus. We find no genome-wide significant differences in DNA methylation, but abundant differences in gene expression. These gene expression differences are overall quite similar to gene expression differences between schizophrenia cases and controls. Interestingly, gene expression differences based on antipsychotic toxicology differ between brain regions, potentially due to differences in the cell types affected. Finally, we assess similarities with effects in a mouse model and find some overlapping effects but many differences as well. As a first look at the molecular effects of antipsychotics in the human brain, the lack of epigenetic effects is unexpected, possibly because long-term treatment effects may be relatively stable for extended periods.


Subjects
Antipsychotic Agents; Psychotic Disorders; Schizophrenia; Animals; Antipsychotic Agents/pharmacology; Antipsychotic Agents/therapeutic use; Caudate Nucleus; Humans; Mice; Phenotype; Psychotic Disorders/drug therapy; Schizophrenia/drug therapy; Schizophrenia/genetics
8.
Bioinformatics ; 38(8): 2081-2087, 2022 04 12.
Article in English | MEDLINE | ID: mdl-35139149

ABSTRACT

SUMMARY: Over the past decade, short-read sequence alignment has become a mature technology. Optimized algorithms, careful software engineering and high-speed hardware have contributed to greatly increased throughput and accuracy. With these improvements, many opportunities for performance optimization have emerged. In this review, we examine three general-purpose short-read alignment tools (BWA-MEM, Bowtie 2 and Arioc) with a focus on performance optimization. We analyze the performance-related behavior of the algorithms and heuristics each tool implements, with the goal of arriving at practical methods of improving processing speed and accuracy. We indicate where an aligner's default behavior may result in suboptimal performance, explore the effects of computational constraints such as end-to-end mapping and alignment scoring threshold, and discuss sources of imprecision in the computation of alignment scores and mapping quality. With this perspective, we describe an approach to tuning short-read aligner performance to meet specific data-analysis and throughput requirements while avoiding potential inaccuracies in subsequent analysis of alignment results. Finally, we illustrate how this approach avoids easily overlooked pitfalls and leads to verifiable improvements in alignment speed and accuracy. CONTACT: richard.wilton@jhu.edu. SUPPLEMENTARY INFORMATION: Appendices referenced in this article are available at Bioinformatics online.


Subjects
High-Throughput Nucleotide Sequencing; Software; Sequence Analysis, DNA/methods; High-Throughput Nucleotide Sequencing/methods; Algorithms; Sequence Alignment
9.
Nat Commun ; 12(1): 5251, 2021 09 02.
Article in English | MEDLINE | ID: mdl-34475392

ABSTRACT

DNA methylation (DNAm) is an epigenetic regulator of gene expression and a hallmark of gene-environment interaction. Using whole-genome bisulfite sequencing, we have surveyed DNAm in 344 samples of human postmortem brain tissue from neurotypical subjects and individuals with schizophrenia. We identify genetic influence on local methylation levels throughout the genome, both at CpG sites and CpH sites, with 86% of SNPs and 55% of CpGs being part of methylation quantitative trait loci (meQTLs). These associations can further be clustered into regions that are differentially methylated by a given SNP, highlighting the genes and regions with which these loci are epigenetically associated. These findings can be used to better characterize schizophrenia GWAS-identified variants as epigenetic risk variants. Regions differentially methylated by schizophrenia risk-SNPs explain much of the heritability associated with risk loci, despite covering only a fraction of the genomic space. We provide a comprehensive, single base resolution view of association between genetic variation and genomic methylation, and implicate schizophrenia GWAS-associated variants as influencing the epigenetic plasticity of the brain.


Subjects
DNA Methylation; Genome, Human; Quantitative Trait Loci/genetics; Schizophrenia/genetics; Age Factors; Brain/metabolism; Brain/pathology; CpG Islands/genetics; Epigenesis, Genetic; Genetic Predisposition to Disease/genetics; Genetic Variation; Genome-Wide Association Study; Genotype; Humans; Polymorphism, Single Nucleotide
10.
Science ; 372(6547)2021 06 11.
Article in English | MEDLINE | ID: mdl-34112666

ABSTRACT

Next-generation tissue-based biomarkers for immunotherapy will likely include the simultaneous analysis of multiple cell types and their spatial interactions, as well as distinct expression patterns of immunoregulatory molecules. Here, we introduce a comprehensive platform for multispectral imaging and mapping of multiple parameters in tumor tissue sections with high-fidelity single-cell resolution. Image analysis and data handling components were drawn from the field of astronomy. Using this "AstroPath" whole-slide platform and only six markers, we identified key features in pretreatment melanoma specimens that predicted response to anti-programmed cell death-1 (PD-1)-based therapy, including CD163+PD-L1- myeloid cells and CD8+FoxP3+PD-1low/mid T cells. These features were combined to stratify long-term survival after anti-PD-1 blockade. This signature was validated in an independent cohort of patients with melanoma from a different institution.


Subjects
Antineoplastic Agents, Immunological/therapeutic use; Biomarkers, Tumor/analysis; Fluorescent Antibody Technique; Melanoma/drug therapy; Programmed Cell Death 1 Receptor/antagonists & inhibitors; Adult; Aged; Aged, 80 and over; Antigens, CD/analysis; Antigens, Differentiation, Myelomonocytic/analysis; B7-H1 Antigen/analysis; CD8 Antigens/analysis; Female; Forkhead Transcription Factors/analysis; Humans; Immune Checkpoint Proteins/analysis; Macrophages/chemistry; Male; Melanoma/chemistry; Melanoma/immunology; Melanoma/pathology; Middle Aged; Prognosis; Programmed Cell Death 1 Receptor/analysis; Progression-Free Survival; Receptors, Cell Surface/analysis; SOXE Transcription Factors/analysis; Single-Cell Analysis; T-Lymphocyte Subsets/chemistry; T-Lymphocyte Subsets/immunology; Treatment Outcome; Tumor Microenvironment
11.
Epigenetics ; 16(1): 1-13, 2021 01.
Article in English | MEDLINE | ID: mdl-32602773

ABSTRACT

DNA methylation (DNAm) is a key epigenetic regulator of gene expression across development. The developing prenatal brain is a highly dynamic tissue, but our understanding of key drivers of epigenetic variability across development is limited. We, therefore, assessed genomic methylation at over 39 million sites in the prenatal cortex using whole-genome bisulfite sequencing and found loci and regions in which methylation levels are dynamic across development. We saw that DNAm at these loci was associated with nearby gene expression and enriched for enhancer chromatin states in prenatal brain tissue. Additionally, these loci were enriched for genes associated with neuropsychiatric disorders and genes involved with neurogenesis. We also found autosomal differences in DNAm between the sexes during prenatal development, though these have less clear functional consequences. We lastly confirmed that the dynamic methylation at this critical period is specifically CpG methylation, with generally low levels of CpH methylation. Our findings provide detailed insight into prenatal brain development as well as clues to the pathogenesis of psychiatric traits seen later in life.


Subjects
Cerebral Cortex/metabolism; DNA Methylation; Cerebral Cortex/embryology; CpG Islands; Epigenesis, Genetic; Epigenome; Female; Fetus/metabolism; Genetic Loci; Humans; Male
12.
Front Physiol ; 11: 583333, 2020.
Article in English | MEDLINE | ID: mdl-33192595

ABSTRACT

Overwhelming evidence has shown the significant role of the tumor microenvironment (TME) in governing the triple-negative breast cancer (TNBC) progression. Digital pathology can provide key information about the spatial heterogeneity within the TME using image analysis and spatial statistics. These analyses have been applied to CD8+ T cells, but quantitative analyses of other important markers and their correlations are limited. In this study, a digital pathology computational workflow is formulated for characterizing the spatial distributions of five immune markers (CD3, CD4, CD8, CD20, and FoxP3) and then the functionality is tested on whole slide images from patients with TNBC. The workflow is initiated by digital image processing to extract and colocalize immune marker-labeled cells and then convert this information to point patterns. Afterward invasive front (IF), central tumor (CT), and normal tissue (N) are characterized. For each region, we examine the intra-tumoral heterogeneity. The workflow is then repeated for all specimens to capture inter-tumoral heterogeneity. In this study, both intra- and inter-tumoral heterogeneities are observed for all five markers across all specimens. Among all regions, IF tends to have higher densities of immune cells and overall larger variations in spatial model fitting parameters and higher density in cell clusters and hotspots compared to CT and N. Results suggest a distinct role of IF in the tumor immuno-architecture. Though the sample size is limited in the study, the computational workflow could be readily reproduced and scaled due to its automatic nature. Importantly, the value of the workflow also lies in its potential to be linked to treatment outcomes and identification of predictive biomarkers for responders/non-responders, and its application to parameterization and validation of computational immuno-oncology models.

13.
PLoS Comput Biol ; 16(11): e1008383, 2020 11.
Article in English | MEDLINE | ID: mdl-33166275

ABSTRACT

In large DNA sequence repositories, archival data storage is often coupled with computers that provide 40 or more CPU threads and multiple GPU (general-purpose graphics processing unit) devices. This presents an opportunity for DNA sequence alignment software to exploit high-concurrency hardware to generate short-read alignments at high speed. Arioc, a GPU-accelerated short-read aligner, can compute WGS (whole-genome sequencing) alignments ten times faster than comparable CPU-only alignment software. When two or more GPUs are available, Arioc's speed increases proportionately because the software executes concurrently on each available GPU device. We have adapted Arioc to recent multi-GPU hardware architectures that support high-bandwidth peer-to-peer memory accesses among multiple GPUs. By modifying Arioc's implementation to exploit this GPU memory architecture we obtained a further 1.8x-2.9x increase in overall alignment speeds. With this additional acceleration, Arioc computes two million short-read alignments per second in a four-GPU system; it can align the reads from a human WGS sequencer run (over 500 million 150nt paired-end reads) in less than 15 minutes. As WGS data accumulates exponentially and high-concurrency computational resources become widespread, Arioc addresses a growing need for timely computation in the short-read data analysis toolchain.
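
As a rough consistency check on the throughput figures quoted in this abstract, the arithmetic can be sketched in Python. The sketch rests on assumptions that are not in the abstract: the quoted rate is treated as sustained for the whole run, "over 500 million" is rounded down to exactly 500 million, and file I/O and startup time are ignored. Because the abstract does not say whether the count refers to individual reads or to read pairs, both readings are computed.

```python
ALIGNMENTS_PER_SECOND = 2_000_000  # quoted rate for a four-GPU system
READ_COUNT = 500_000_000           # "over 500 million", rounded down (assumption)


def minutes_to_align(n_reads: int, rate_per_second: int) -> float:
    """Idealized alignment time in minutes at a constant rate."""
    return n_reads / rate_per_second / 60.0


# Reading 1: the count refers to individual reads.
as_reads = minutes_to_align(READ_COUNT, ALIGNMENTS_PER_SECOND)

# Reading 2: the count refers to pairs, so twice as many reads are aligned.
as_pairs = minutes_to_align(2 * READ_COUNT, ALIGNMENTS_PER_SECOND)

print(f"counted as reads: {as_reads:.1f} min")  # about 4.2
print(f"counted as pairs: {as_pairs:.1f} min")  # about 8.3
```

Under either reading the idealized time falls below the quoted 15 minutes, which leaves room for the overheads this sketch ignores.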


Subjects
Sequence Alignment/methods; Software; Algorithms; Base Sequence; Computational Biology; Computer Graphics; Computers; Databases, Nucleic Acid; Humans; Information Storage and Retrieval; Sequence Alignment/statistics & numerical data; Sequence Analysis, DNA; Whole Genome Sequencing
14.
Bioinformatics ; 35(4): 665-670, 2019 02 15.
Article in English | MEDLINE | ID: mdl-30052772

ABSTRACT

MOTIVATION: DNA sequencing archives have grown to enormous scales in recent years, and thousands of human genomes have already been sequenced. The size of these data sets has made searching the raw read data infeasible without high-performance data-query technology. Additionally, it is challenging to search a repository of short-read data using relational logic and to apply that logic across samples from multiple whole-genome sequencing samples. RESULTS: We have built a compact, efficiently-indexed database that contains the raw read data for over 250 human genomes, encompassing trillions of bases of DNA, and that allows users to search these data in real-time. The Terabase Search Engine enables retrieval from this database of all the reads for any genomic location in a matter of seconds. Users can search using a range of positions or a specific sequence that is aligned to the genome on the fly. AVAILABILITY AND IMPLEMENTATION: Public access to the Terabase Search Engine database is available at http://tse.idies.jhu.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Databases, Genetic; Search Engine; Software; Genome, Human; Genomics; Humans; Sequence Analysis, DNA
15.
Bioinformatics ; 34(15): 2673-2675, 2018 08 01.
Article in English | MEDLINE | ID: mdl-29554207

ABSTRACT

Motivation: The alignment of bisulfite-treated DNA sequences (BS-seq reads) to a large genome involves a significant computational burden beyond that required to align non-bisulfite-treated reads. In the analysis of BS-seq data, this can present an important performance bottleneck that can be mitigated by appropriate algorithmic and software-engineering improvements. One strategy is to modify the read-alignment algorithms by integrating the logic related to BS-seq alignment, with the goal of making the software implementation amenable to optimizations that lead to higher speed and greater sensitivity than might otherwise be attainable. Results: We evaluated this strategy using Arioc, a short-read aligner that uses GPU (general-purpose graphics processing unit) hardware to accelerate computationally-expensive programming logic. We integrated the BS-seq computational logic into both GPU and CPU code throughout the Arioc implementation. We then carried out a read-by-read comparison of Arioc's reported alignments with the alignments reported by well-known CPU-based BS-seq read aligners. With simulated reads, Arioc's accuracy is equal to or better than the other read aligners we evaluated. With human sequencing reads, Arioc's throughput is at least 10 times faster than existing BS-seq aligners across a wide range of sensitivity settings. Availability and implementation: The Arioc software is available for download at https://github.com/RWilton/Arioc. It is released under a BSD open-source license. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Genomics/methods; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Software; Algorithms; Humans; Sulfites
16.
PeerJ ; 3: e808, 2015.
Article in English | MEDLINE | ID: mdl-25780763

ABSTRACT

When computing alignments of DNA sequences to a large genome, a key element in achieving high processing throughput is to prioritize locations in the genome where high-scoring mappings might be expected. We formulated this task as a series of list-processing operations that can be efficiently performed on graphics processing unit (GPU) hardware. We followed this approach in implementing a read aligner called Arioc that uses GPU-based parallel sort and reduction techniques to identify high-priority locations where potential alignments may be found. We then carried out a read-by-read comparison of Arioc's reported alignments with the alignments found by several leading read aligners. With simulated reads, Arioc has comparable or better accuracy than the other read aligners we tested. With human sequencing reads, Arioc demonstrates significantly greater throughput than the other aligners we evaluated across a wide range of sensitivity settings. The Arioc software is available at https://github.com/RWilton/Arioc. It is released under a BSD open-source license.

17.
Neuron ; 83(6): 1249-52, 2014 Sep 17.
Article in English | MEDLINE | ID: mdl-25233306

ABSTRACT

The analysis of data requires computation: originally by hand and more recently by computers. Different models of computing are designed and optimized for different kinds of data. In data-intensive science, the scale and complexity of data exceeds the comfort zone of local data stores on scientific workstations. Thus, cloud computing emerges as the preeminent model, utilizing data centers and high-performance clusters, enabling remote users to access and query subsets of the data efficiently. We examine how data-intensive computational systems originally built for cosmology, the Sloan Digital Sky Survey (SDSS), are now being used in connectomics, at the Open Connectome Project. We list lessons learned and outline the top challenges we expect to face. Success in computational connectomics would drastically reduce the time between idea and discovery, as SDSS did in cosmology.


Subjects
Computers; Connectome/methods; Information Systems; Software; Statistics as Topic/methods; Animals; Computational Biology/methods; Humans
18.
Article in English | MEDLINE | ID: mdl-24401992

ABSTRACT

We describe a scalable database cluster for the spatial analysis and annotation of high-throughput brain imaging data, initially for 3-d electron microscopy image stacks, but for time-series and multi-channel data as well. The system was designed primarily for workloads that build connectomes (neural connectivity maps of the brain) using the parallel execution of computer vision algorithms on high-performance compute clusters. These services and open-science data sets are publicly available at openconnecto.me. The system design inherits much from NoSQL scale-out and data-intensive computing architectures. We distribute data to cluster nodes by partitioning a spatial index. We direct I/O to different systems (reads to parallel disk arrays and writes to solid-state storage) to avoid I/O interference and maximize throughput. All programming interfaces are RESTful Web services, which are simple and stateless, improving scalability and usability. We include a performance evaluation of the production system, highlighting the effectiveness of spatial data organization.

19.
ICS ; 2013.
Article in English | MEDLINE | ID: mdl-24402052

ABSTRACT

We describe a storage system that removes I/O bottlenecks to achieve more than one million IOPS based on a user-space file abstraction for arrays of commodity SSDs. The file abstraction refactors I/O scheduling and placement for extreme parallelism and non-uniform memory and I/O. The system includes a set-associative, parallel page cache in user space. We redesign page caching to eliminate CPU overhead and lock contention in non-uniform memory architecture machines. We evaluate our design on a 32-core NUMA machine with four eight-core processors. Experiments show that our design delivers 1.23 million 512-byte read IOPS. The page cache realizes the scalable IOPS of Linux asynchronous I/O (AIO) and increases user-perceived I/O performance linearly with cache hit rates. The parallel, set-associative cache matches the cache hit rates of the global Linux page cache under real workloads.

20.
PLoS One ; 7(1): e29889, 2012.
Article in English | MEDLINE | ID: mdl-22253817

ABSTRACT

In this paper, we use a statistical estimator developed in astrophysics to study the distribution and organization of features of the human genome. Using the human reference sequence we quantify the global distribution of CpG islands (CGI) in each chromosome and demonstrate that the organization of the CGI across a chromosome is non-random, exhibits surprisingly long range correlations (10 Mb) and varies significantly among chromosomes. These correlations of CGI summarize functional properties of the genome that are not captured when considering variation in any particular separate (and local) feature. The demonstration of the proposed methods to quantify the organization of CGI in the human genome forms the basis of future studies. The most illuminating of these will assess the potential impact on phenotypic variation of inter-individual variation in the organization of the functional features of the genome within and among chromosomes, and among individuals for particular chromosomes.


Subjects
CpG Islands/genetics; Genome, Human/genetics; Base Sequence; Chromosomes, Human/genetics; Databases, Genetic; Humans