Results 1 - 20 of 73
1.
Immunity ; 2024 Jun 10.
Article in English | MEDLINE | ID: mdl-38876098

ABSTRACT

Allogeneic T cell expansion is the primary determinant of graft-versus-host disease (GVHD), and current dogma dictates that this is driven by histocompatibility antigen disparities between donor and recipient. This paradigm represents a closed genetic system within which donor T cells interact with peptide-major histocompatibility complexes (MHCs), though clonal interrogation remains challenging due to the sparseness of the T cell repertoire. We developed a Bayesian model using donor and recipient T cell receptor (TCR) frequencies in murine stem cell transplant systems to define limited common expansion of T cell clones across genetically identical donor-recipient pairs. A subset of donor CD4+ T cell clonotypes differentially expanded in identical recipients and were microbiota dependent. Microbiota-specific T cells augmented GVHD lethality and could target microbial antigens presented by gastrointestinal epithelium during an alloreactive response. The microbiota serves as a source of cognate antigens that contribute to clonotypic T cell expansion and the induction of GVHD independent of donor-recipient genetics.

2.
JCI Insight ; 9(9), 2024 May 08.
Article in English | MEDLINE | ID: mdl-38716731

ABSTRACT

T cells are required for protective immunity against Mycobacterium tuberculosis. We recently described a cohort of Ugandan household contacts of tuberculosis cases who appear to "resist" M. tuberculosis infection (resisters; RSTRs) and showed that these individuals harbor IFN-γ-independent T cell responses to M. tuberculosis-specific peptide antigens. However, T cells also recognize nonprotein antigens via antigen-presenting systems that are independent of genetic background, known as donor-unrestricted T cells (DURTs). We used tetramer staining and flow cytometry to characterize the association between DURTs and "resistance" to M. tuberculosis infection. Peripheral blood frequencies of most DURT subsets were comparable between RSTRs and latently infected controls (LTBIs). However, we observed a 1.65-fold increase in frequency of MR1-restricted T (MR1T) cells among RSTRs in comparison with LTBIs. Single-cell RNA sequencing of 18,251 MR1T cells sorted from 8 donors revealed 5,150 clonotypes that expressed a common transcriptional program, the majority of which were private. Sequencing of the T cell receptor α/T cell receptor δ (TCRα/δ) repertoire revealed several DURT clonotypes were expanded among RSTRs, including 2 MR1T clonotypes that recognized mycobacteria-infected cells in a TCR-dependent manner. Overall, our data reveal unexpected donor-specific diversity in the TCR repertoire of human MR1T cells as well as associations between mycobacteria-reactive MR1T clonotypes and resistance to M. tuberculosis infection.


Subjects
Mycobacterium tuberculosis, Humans, Mycobacterium tuberculosis/immunology, Uganda, Adult, Male, Minor Histocompatibility Antigens/immunology, Minor Histocompatibility Antigens/genetics, Female, Tuberculosis/immunology, Tuberculosis/microbiology, T-Lymphocytes/immunology, Latent Tuberculosis/immunology, Latent Tuberculosis/microbiology, Clone Cells/immunology, Disease Resistance/immunology, Disease Resistance/genetics, Young Adult, Histocompatibility Antigens Class I
3.
Proc Natl Acad Sci U S A ; 121(6): e2300838121, 2024 Feb 06.
Article in English | MEDLINE | ID: mdl-38300863

ABSTRACT

Proteins play a central role in biology from immune recognition to brain activity. While major advances in machine learning have improved our ability to predict protein structure from sequence, determining protein function from its sequence or structure remains a major challenge. Here, we introduce holographic convolutional neural network (H-CNN) for proteins, which is a physically motivated machine learning approach to model amino acid preferences in protein structures. H-CNN reflects physical interactions in a protein structure and recapitulates the functional information stored in evolutionary data. H-CNN accurately predicts the impact of mutations on protein stability and binding of protein complexes. Our interpretable computational model for protein structure-function maps could guide design of novel proteins with desired function.


Subjects
Algorithms, Computer Neural Networks, Proteins/genetics, Machine Learning, Amino Acids
4.
PLoS Comput Biol ; 19(11): e1011664, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37983288

ABSTRACT

T cells rely on their T cell receptors (TCRs) to discern foreign antigens presented by human leukocyte antigen (HLA) proteins. The TCRs of an individual contain a record of this individual's past immune activities, such as immune response to infections or vaccines. Mining the TCR data may recover useful information or biomarkers for immune related diseases or conditions. Some TCRs are observed only in the individuals with certain HLA alleles, and thus characterizing TCRs requires a thorough understanding of TCR-HLA associations. The extensive diversity of HLA alleles and the rareness of some HLA alleles present a formidable challenge for this task. Existing methods either treat HLA as a categorical variable or represent an HLA by its alphanumeric name, and have limited ability to generalize to the HLAs that are not seen in the training process. To address this challenge, we propose a neural network-based method named Deep learning Prediction of TCR-HLA association (DePTH) to predict TCR-HLA associations based on their amino acid sequences. We demonstrate that DePTH is capable of making reasonable predictions for TCR-HLA associations, even when neither the HLA nor the TCR have been included in the training dataset. Furthermore, we establish that DePTH can be used to quantify the functional similarities among HLA alleles, and that these HLA similarities are associated with the survival outcomes of cancer patients who received immune checkpoint blockade treatments.


Subjects
HLA Antigens, Histocompatibility Antigens Class I, Humans, HLA Antigens/genetics, HLA Antigens/chemistry, T-Cell Antigen Receptors/genetics, Histocompatibility Antigens Class II, Computer Neural Networks
5.
Nat Commun ; 14(1): 6746, 2023 10 24.
Article in English | MEDLINE | ID: mdl-37875492

ABSTRACT

De novo protein design methods can create proteins with folds not yet seen in nature. These methods largely focus on optimizing the compatibility between the designed sequence and the intended conformation, without explicit consideration of protein folding pathways. Deeply knotted proteins, whose topologies may introduce substantial barriers to folding, thus represent an interesting test case for protein design. Here we report our attempts to design proteins with trefoil (3_1) and pentafoil (5_1) knotted topologies. We extended previously described algorithms for tandem repeat protein design in order to construct deeply knotted backbones and matching designed repeat sequences (N = 3 repeats for the trefoil and N = 5 for the pentafoil). We confirmed the intended conformation for the trefoil design by X-ray crystallography, and we report here on this protein's structure, stability, and folding behaviour. The pentafoil design misfolded into an asymmetric structure (despite a 5-fold symmetric sequence); two of the four repeat-repeat units matched the designed backbone while the other two diverged to form local contacts, leading to a trefoil rather than pentafoil knotted topology. Our results also provide insights into the folding of knotted proteins.


Subjects
Protein Folding, Proteins, Protein Conformation, Proteins/genetics, Proteins/chemistry, Protein Domains, Tandem Repeat Sequences/genetics
6.
bioRxiv ; 2023 May 26.
Article in English | MEDLINE | ID: mdl-37293077

ABSTRACT

T cells rely on their T cell receptors (TCRs) to recognize foreign antigens presented by human leukocyte antigen (HLA) proteins. TCRs contain a record of an individual's past immune activities, and some TCRs are observed only in individuals with certain HLA alleles. As a result, characterising TCRs requires a thorough understanding of TCR-HLA associations. To this end, we propose a neural network method named Deep learning Prediction of TCR-HLA association (DePTH) to predict TCR-HLA associations based on their amino acid sequences. We show that the DePTH can be used to quantify the functional similarities of HLA alleles, and that these HLA similarities are associated with the survival outcomes of cancer patients who received immune checkpoint blockade treatment.

7.
Elife ; 12, 2023 05 25.
Article in English | MEDLINE | ID: mdl-37227256

ABSTRACT

To appropriately defend against a wide array of pathogens, humans somatically generate highly diverse repertoires of B cell and T cell receptors (BCRs and TCRs) through a random process called V(D)J recombination. Receptor diversity is achieved during this process through both the combinatorial assembly of V(D)J-genes and the junctional deletion and insertion of nucleotides. While the Artemis protein is often regarded as the main nuclease involved in V(D)J recombination, the exact mechanism of nucleotide trimming is not understood. Using a previously published TCRβ repertoire sequencing data set, we have designed a flexible probabilistic model of nucleotide trimming that allows us to explore various mechanistically interpretable sequence-level features. We show that local sequence context, length, and GC nucleotide content in both directions of the wider sequence, together, can most accurately predict the trimming probabilities of a given V-gene sequence. Because GC nucleotide content is predictive of sequence-breathing, this model provides quantitative statistical evidence regarding the extent to which double-stranded DNA may need to be able to breathe for trimming to occur. We also see evidence of a sequence motif that appears to get preferentially trimmed, independent of GC-content-related effects. Further, we find that the inferred coefficients from this model provide accurate prediction for V- and J-gene sequences from other adaptive immune receptor loci. These results refine our understanding of how the Artemis nuclease may function to trim nucleotides during V(D)J recombination and provide another step toward understanding how V(D)J recombination generates diverse receptors and supports a powerful, unique immune response in healthy humans.


Subjects
Nucleotides, V(D)J Recombination, Humans, Nucleotides/metabolism, Base Composition
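Illustrative sketch for entry 7: the abstract describes predicting trimming probabilities from trimming length and GC content on both sides of the cut site. The Python below is only a rough illustration of that idea, not the authors' model. The function names, the window size, and the coefficient values are all hypothetical placeholders, and a plain softmax over candidate trimming lengths is assumed for simplicity.

```python
import math


def gc_fraction(seq: str) -> float:
    """Fraction of G or C bases in seq (0.0 for an empty string)."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq.upper()) / len(seq)


def trimming_probabilities(v_gene_end, max_trim, window=6, coefs=(-0.3, 1.0, -1.0)):
    """Softmax over candidate trimming lengths 0..max_trim.

    coefs = (weight on trimming length, weight on 5' GC, weight on 3' GC).
    These values are arbitrary placeholders, NOT fitted coefficients.
    """
    w_len, w_5p, w_3p = coefs
    scores = []
    for n_trim in range(max_trim + 1):
        cut = len(v_gene_end) - n_trim  # index where the sequence would be cut
        gc_5p = gc_fraction(v_gene_end[max(0, cut - window):cut])
        gc_3p = gc_fraction(v_gene_end[cut:cut + window])
        scores.append(w_len * n_trim + w_5p * gc_5p + w_3p * gc_3p)
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical 3' end of a V-gene sequence, used only to exercise the code.
probs = trimming_probabilities("TGTGCCAGCAGCTTAGC", max_trim=5)
```

In a real analysis the coefficients would be inferred from repertoire data; here they only show how sequence-level features could enter a per-site probability.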
8.
Proc Natl Acad Sci U S A ; 120(9): e2216697120, 2023 02 28.
Article in English | MEDLINE | ID: mdl-36802421

ABSTRACT

Peptide-binding proteins play key roles in biology, and predicting their binding specificity is a long-standing challenge. While considerable protein structural information is available, the most successful current methods use sequence information alone, in part because it has been a challenge to model the subtle structural changes accompanying sequence substitutions. Protein structure prediction networks such as AlphaFold model sequence-structure relationships very accurately, and we reasoned that if it were possible to specifically train such networks on binding data, more generalizable models could be created. We show that placing a classifier on top of the AlphaFold network and fine-tuning the combined network parameters for both classification and structure prediction accuracy leads to a model with strong generalizable performance on a wide range of Class I and Class II peptide-MHC interactions that approaches the overall performance of the state-of-the-art NetMHCpan sequence-based method. The peptide-MHC optimized model shows excellent performance in distinguishing binding and non-binding peptides to SH3 and PDZ domains. This ability to generalize well beyond the training set far exceeds that of sequence-only models and should be particularly powerful for systems where less experimental data are available.


Subjects
Histocompatibility Antigens Class II, Peptides, Protein Binding, Peptides/chemistry, Histocompatibility Antigens Class II/metabolism, MHC Class II Genes, PDZ Domains
9.
Elife ; 12, 2023 Jan 20.
Article in English | MEDLINE | ID: mdl-36661395

ABSTRACT

The regulatory and effector functions of T cells are initiated by the binding of their cell-surface T cell receptor (TCR) to peptides presented by major histocompatibility complex (MHC) proteins on other cells. The specificity of TCR:peptide-MHC interactions, thus, underlies nearly all adaptive immune responses. Despite intense interest, generalizable predictive models of TCR:peptide-MHC specificity remain out of reach; two key barriers are the diversity of TCR recognition modes and the paucity of training data. Inspired by recent breakthroughs in protein structure prediction achieved by deep neural networks, we evaluated structural modeling as a potential avenue for prediction of TCR epitope specificity. We show that a specialized version of the neural network predictor AlphaFold can generate models of TCR:peptide-MHC interactions that can be used to discriminate correct from incorrect peptide epitopes with substantial accuracy. Although much work remains to be done for these predictions to have widespread practical utility, we are optimistic that deep learning-based structural modeling represents a path to generalizable prediction of TCR:peptide-MHC interaction specificity.


Subjects
Peptides, T-Cell Antigen Receptors, T-Cell Antigen Receptors/metabolism, Peptides/metabolism, T-Lymphocytes, Epitopes/metabolism, Protein Binding
10.
PLoS Comput Biol ; 18(12): e1010681, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36476997

ABSTRACT

The complexity of entire T cell receptor (TCR) repertoires makes their comparison a difficult but important task. Current methods of TCR repertoire comparison can incur a high loss of distributional information by considering overly simplistic sequence- or repertoire-level characteristics. Optimal transport methods form a suitable approach for such comparison given some distance or metric between values in the sample space, with appealing theoretical and computational properties. In this paper we introduce a nonparametric approach to comparing empirical TCR repertoires that applies the Sinkhorn distance, a fast, contemporary optimal transport method, and a recently-created distance between TCRs called TCRdist. We show that our methods identify meaningful differences between samples from distinct TCR distributions for several case studies, and compete with more complicated methods despite minimal modeling assumptions and a simpler pipeline.
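Illustrative sketch for entry 10: the abstract pairs an optimal transport method (the Sinkhorn distance) with a distance between TCRs. The Python below shows the generic Sinkhorn iteration on two toy repertoires. It is a minimal sketch under stated assumptions: a Hamming distance on equal-length CDR3 strings stands in for TCRdist, and the sequences, frequencies, and regularization value are invented for illustration.

```python
import numpy as np


def hamming(a: str, b: str) -> float:
    """Toy distance between equal-length strings (a stand-in for TCRdist)."""
    return float(sum(x != y for x, y in zip(a, b)))


def sinkhorn_distance(p, q, cost, reg=1.0, n_iter=500):
    """Entropy-regularized optimal transport between histograms p and q.

    Returns the transport cost <plan, cost> and the transport plan.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    kernel = np.exp(-cost / reg)  # Gibbs kernel
    u = np.ones_like(p)
    for _ in range(n_iter):
        v = q / (kernel.T @ u)  # rescale to match column marginals
        u = p / (kernel @ v)    # rescale to match row marginals
    plan = u[:, None] * kernel * v[None, :]
    return float(np.sum(plan * cost)), plan


# Hypothetical repertoires: sequences with their relative frequencies.
rep_a = ["CASSL", "CASSF", "CASRG"]
rep_b = ["CASSL", "CATTG"]
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.6, 0.4])

cost = np.array([[hamming(a, b) for b in rep_b] for a in rep_a])
dist, plan = sinkhorn_distance(p, q, cost)
```

The regularization strength `reg` trades accuracy for speed: smaller values approach unregularized optimal transport but need more iterations and can underflow in the kernel.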

11.
Cell Stem Cell ; 29(9): 1346-1365.e10, 2022 09 01.
Article in English | MEDLINE | ID: mdl-36055191

ABSTRACT

A hallmark of primate postimplantation embryogenesis is the specification of extraembryonic mesoderm (EXM) before gastrulation, in contrast to rodents where this tissue is formed only after gastrulation. Here, we discover that naive human pluripotent stem cells (hPSCs) are competent to differentiate into EXM cells (EXMCs). EXMCs are specified by inhibition of Nodal signaling and GSK3B, are maintained by mTOR and BMP4 signaling activity, and their transcriptome and epigenome closely resemble that of human and monkey embryo EXM. EXMCs are mesenchymal, can arise from an epiblast intermediate, and are capable of self-renewal. Thus, EXMCs arising via primate-specific specification between implantation and gastrulation can be modeled in vitro. We also find that most of the rare off-target cells within human blastoids formed by triple inhibition (Kagawa et al., 2021) correspond to EXMCs. Our study impacts our ability to model and study the molecular mechanisms of early human embryogenesis and related defects.


Subjects
Pluripotent Stem Cells, Animals, Cell Differentiation, Mammalian Embryo, Germ Layers, Humans, Mesoderm, Primates
12.
Methods Mol Biol ; 2574: 367-388, 2022.
Article in English | MEDLINE | ID: mdl-36087211

ABSTRACT

Advances in single-cell technologies have made it possible to simultaneously quantify gene expression and immune receptor sequence across thousands of individual T or B cells in a single experiment. Data from such experiments are advancing our understanding of the relationship between adaptive immune receptor sequence and transcriptional profile. We recently reported a software tool, CoNGA, specifically developed to detect correlation between receptor sequence and transcriptional profile. Here we describe in detail how CoNGA can be applied to analyze a large and diverse T cell dataset featuring multiple donors and batch annotations. Our workflow illustrates new analysis modes for the detection of TCR sequence convergence into similarity clusters and of matches to literature-derived TCR databases, as well as processing of gamma-delta T cells.


Subjects
gamma-delta T-Cell Antigen Receptors, Software, gamma-delta T-Cell Antigen Receptors/genetics
14.
Elife ; 11, 2022 03 22.
Article in English | MEDLINE | ID: mdl-35315770

ABSTRACT

Every T cell receptor (TCR) repertoire is shaped by a complex probabilistic tangle of genetically determined biases and immune exposures. T cells combine a random V(D)J recombination process with a selection process to generate highly diverse and functional TCRs. The extent to which an individual's genetic background is associated with their resulting TCR repertoire diversity has yet to be fully explored. Using a previously published repertoire sequencing dataset paired with high-resolution genome-wide genotyping from a large human cohort, we infer specific genetic loci associated with V(D)J recombination probabilities using genome-wide association inference. We show that V(D)J gene usage profiles are associated with variation in the TCRB locus and, specifically for the functional TCR repertoire, variation in the major histocompatibility complex locus. Further, we identify specific variations in the genes encoding the Artemis protein and the TdT protein to be associated with biasing junctional nucleotide deletion and N-insertion, respectively. These results refine our understanding of genetically-determined TCR repertoire biases by confirming and extending previous studies on the genetic determinants of V(D)J gene usage and providing the first examples of trans genetic variants which are associated with modifying junctional diversity. Together, these insights lay the groundwork for further explorations into how immune responses vary between individuals.


Subjects
Genome-Wide Association Study, V(D)J Recombination, Genetic Loci, Genotype, Humans, Probability, T-Cell Antigen Receptors/genetics, V(D)J Recombination/genetics
15.
Proc Natl Acad Sci U S A ; 119(6), 2022 02 08.
Article in English | MEDLINE | ID: mdl-35105810

ABSTRACT

Competition between antigen-specific T cells for peptide:MHC complexes shapes the ensuing T cell response. Mouse model studies provided compelling evidence that competition is a highly effective mechanism controlling the activation of naïve T cells. However, assessing the effect of T cell competition in the context of a human infection requires defined pathogen kinetics and trackable naïve and memory T cell populations of defined specificity. A unique cohort of nonmyeloablative hematopoietic stem cell transplant patients allowed us to assess T cell competition in response to cytomegalovirus (CMV) reactivation, which was documented with detailed virology data. In our cohort, hematopoietic stem cell transplant donors and recipients were CMV seronegative and positive, respectively, thus providing genetically distinct memory and naïve T cell populations. We used single-cell transcriptomics to track donor versus recipient-derived T cell clones over the course of 90 d. We found that donor-derived T cell clones proliferated and expanded substantially following CMV reactivation. However, for immunodominant CMV epitopes, recipient-derived memory T cells remained the overall dominant population. This dominance was maintained despite more robust clonal expansion of donor-derived T cells in response to CMV reactivation. Interestingly, the donor-derived T cells that were recruited into these immunodominant memory populations shared strikingly similar TCR properties with the recipient-derived memory T cells. This selective recruitment of identical and nearly identical clones from the naïve into the immunodominant memory T cell pool suggests that competition is in place but does not interfere with rejuvenating a memory T cell population. Instead, it results in selection of convergent clones to the memory T cell pool.


Subjects
CD8-Positive T-Lymphocytes/immunology, Cytomegalovirus Infections/immunology, Cytomegalovirus/physiology, Hematopoietic Stem Cell Transplantation, Memory T Cells/immunology, Tissue Donors, Virus Activation/immunology, Adult, Aged, Aged 80 and over, Female, Humans, Male, Middle Aged
16.
Nat Biotechnol ; 40(1): 54-63, 2022 01.
Article in English | MEDLINE | ID: mdl-34426704

ABSTRACT

Links between T cell clonotypes, as defined by T cell receptor (TCR) sequences, and phenotype, as reflected in gene expression (GEX) profiles, surface protein expression and peptide:major histocompatibility complex binding, can reveal functional relationships beyond the features shared by clonally related cells. Here we present clonotype neighbor graph analysis (CoNGA), a graph theoretic approach that identifies correlations between GEX profile and TCR sequence through statistical analysis of GEX and TCR similarity graphs. Using CoNGA, we uncovered associations between TCR sequence and GEX profiles that include a previously undescribed 'natural lymphocyte' population of human circulating CD8+ T cells and a set of TCR sequence determinants of differentiation in thymocytes. These examples show that CoNGA might help elucidate complex relationships between TCR sequence and T cell phenotype in large, heterogeneous, single-cell datasets.


Subjects
CD8-Positive T-Lymphocytes, alpha-beta T-Cell Antigen Receptors, CD8-Positive T-Lymphocytes/metabolism, Cell Differentiation, T-Cell Antigen Receptors/genetics, T-Cell Antigen Receptors/metabolism, alpha-beta T-Cell Antigen Receptors/genetics
17.
Elife ; 10, 2021 11 30.
Article in English | MEDLINE | ID: mdl-34845983

ABSTRACT

T-cell receptors (TCRs) encode clinically valuable information that reflects prior antigen exposure and potential future response. However, despite advances in deep repertoire sequencing, enormous TCR diversity complicates the use of TCR clonotypes as clinical biomarkers. We propose a new framework that leverages experimentally inferred antigen-associated TCRs to form meta-clonotypes - groups of biochemically similar TCRs - that can be used to robustly quantify functionally similar TCRs in bulk repertoires across individuals. We apply the framework to TCR data from COVID-19 patients, generating 1831 public TCR meta-clonotypes from the SARS-CoV-2 antigen-associated TCRs that have strong evidence of restriction to patients with a specific human leukocyte antigen (HLA) genotype. Applied to independent cohorts, meta-clonotypes targeting these specific epitopes were more frequently detected in bulk repertoires compared to exact amino acid matches, and 59.7% (1093/1831) were more abundant among COVID-19 patients that expressed the putative restricting HLA allele (false discovery rate [FDR]<0.01), demonstrating the potential utility of meta-clonotypes as antigen-specific features for biomarker development. To enable further applications, we developed an open-source software package, tcrdist3, that implements this framework and facilitates flexible workflows for distance-based TCR repertoire analysis.


Subjects
Viral Antigens/genetics, COVID-19/immunology, HLA Antigens/genetics, T-Cell Antigen Receptors/genetics, SARS-CoV-2/immunology, Viral Antigens/immunology, Biomarkers, COVID-19/genetics, Complementarity Determining Regions/immunology, Computational Biology/methods, Epitopes/genetics, Epitopes/immunology, Genotype, HLA Antigens/immunology, Humans, T-Cell Antigen Receptors/immunology
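Illustrative sketch for entry 17: the abstract reports that groups of similar TCRs (meta-clonotypes) were detected in bulk repertoires more often than exact amino acid matches. The Python below illustrates that contrast with a centroid sequence and a distance radius. It is a hedged sketch only: a plain edit distance stands in for the TCR distance used by the authors, it does not use the tcrdist3 API, and the sequences, counts, and function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, used here as a stand-in for a TCR distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def meta_clonotype_frequency(centroid: str, radius: int, repertoire: dict) -> float:
    """Fraction of repertoire counts within `radius` of the centroid.

    repertoire maps CDR3 amino acid sequence -> observed count.
    """
    total = sum(repertoire.values())
    if total == 0:
        return 0.0
    hits = sum(count for seq, count in repertoire.items()
               if edit_distance(centroid, seq) <= radius)
    return hits / total


# Hypothetical bulk repertoire with made-up counts.
bulk = {"CASSLGQAYEQYF": 5, "CASSLGQSYEQYF": 3, "CASSPDRGNTEAFF": 12}

exact = meta_clonotype_frequency("CASSLGQAYEQYF", 0, bulk)  # exact matches only
fuzzy = meta_clonotype_frequency("CASSLGQAYEQYF", 1, bulk)  # radius-1 neighbors
```

With radius 0 only the centroid itself is counted; widening the radius also captures the one-substitution neighbor, which is the sense in which a grouped definition can be detected more often than an exact match.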
18.
Commun Biol ; 4(1): 1240, 2021 10 29.
Article in English | MEDLINE | ID: mdl-34716407

ABSTRACT

Circular tandem repeat proteins ('cTRPs') are de novo designed protein scaffolds (in this and prior studies, based on antiparallel two-helix bundles) that contain repeated protein sequences and structural motifs and form closed circular structures. They can display significant stability and solubility, a wide range of sizes, and are useful as protein display particles for biotechnology applications. However, cTRPs also demonstrate inefficient self-assembly from smaller subunits. In this study, we describe a new generation of cTRPs, with longer repeats and increased interaction surfaces, which enhanced the self-assembly of two significantly different sizes of homotrimeric constructs. Finally, we demonstrated functionalization of these constructs with (1) a hexameric array of peptide-binding SH2 domains, and (2) a trimeric array of anti-SARS CoV-2 VHH domains. The latter proved capable of sub-nanomolar binding affinities towards the viral receptor binding domain and potent viral neutralization function.


Subjects
Angiotensin-Converting Enzyme 2/metabolism, COVID-19/metabolism, Protein Engineering/methods, Proteins/chemistry, Proteins/metabolism, SARS-CoV-2/metabolism, Tandem Repeat Sequences, Amino Acid Sequence, COVID-19/virology, Computer Simulation, Crystallization, HEK293 Cells, Humans, Molecular Models, Neutralization Tests, Protein Binding, Protein Domains, Protein Folding, Secondary Protein Structure, Coronavirus Spike Glycoprotein/chemistry, Coronavirus Spike Glycoprotein/metabolism
19.
bioRxiv ; 2021 Mar 19.
Article in English | MEDLINE | ID: mdl-33398288

ABSTRACT

As the mechanistic basis of adaptive cellular antigen recognition, T cell receptors (TCRs) encode clinically valuable information that reflects prior antigen exposure and potential future response. However, despite advances in deep repertoire sequencing, enormous TCR diversity complicates the use of TCR clonotypes as clinical biomarkers. We propose a new framework that leverages antigen-enriched repertoires to form meta-clonotypes - groups of biochemically similar TCRs - that can be used to robustly identify and quantify functionally similar TCRs in bulk repertoires. We apply the framework to TCR data from COVID-19 patients, generating 1831 public TCR meta-clonotypes from the 17 SARS-CoV-2 antigen-enriched repertoires with the strongest evidence of HLA-restriction. Applied to independent cohorts, meta-clonotypes targeting these specific epitopes were more frequently detected in bulk repertoires compared to exact amino acid matches, and 59.7% (1093/1831) were more abundant among COVID-19 patients that expressed the putative restricting HLA allele (FDR < 0.01), demonstrating the potential utility of meta-clonotypes as antigen-specific features for biomarker development. To enable further applications, we developed an open-source software package, tcrdist3, that implements this framework and facilitates flexible workflows for distance-based TCR repertoire analysis.

20.
PLoS Comput Biol ; 16(5): e1007507, 2020 05.
Article in English | MEDLINE | ID: mdl-32365137

ABSTRACT

Many scientific disciplines rely on computational methods for data analysis, model generation, and prediction. Implementing these methods is often accomplished by researchers with domain expertise but without formal training in software engineering or computer science. This arrangement has led to underappreciation of sustainability and maintainability of scientific software tools developed in academic environments. Some software tools have avoided this fate, including the scientific library Rosetta. We use this software and its community as a case study to show how modern software development can be accomplished successfully, irrespective of subject area. Rosetta is one of the largest software suites for macromolecular modeling, with 3.1 million lines of code and many state-of-the-art applications. Since the mid 1990s, the software has been developed collaboratively by the RosettaCommons, a community of academics from over 60 institutions worldwide with diverse backgrounds including chemistry, biology, physiology, physics, engineering, mathematics, and computer science. Developing this software suite has provided us with more than two decades of experience in how to effectively develop advanced scientific software in a global community with hundreds of contributors. Here we illustrate the functioning of this development community by addressing technical aspects (like version control, testing, and maintenance), community-building strategies, diversity efforts, software dissemination, and user support. We demonstrate how modern computational research can thrive in a distributed collaborative community. The practices described here are independent of subject area and can be readily adopted by other software development communities.


Subjects
Computational Biology/methods, Research/trends, Software/trends, Cooperative Behavior, Data Analysis, Engineering, Gene Library, Humans, Molecular Models, Research Personnel, Social Behavior, User-Computer Interface