Results 1 - 20 of 49
1.
J Proteome Res ; 23(2): 574-584, 2024 Feb 02.
Article in English | MEDLINE | ID: mdl-38157563

ABSTRACT

Accurate and comprehensive peptide precursor ions are crucial to tandem mass-spectrometry-based peptide identification. An identification engine can derive great advantages from the search space reduction enabled by credible and detailed precursors. Furthermore, by considering multiple precursors per spectrum, both the number of identifications and the spectrum explainability can be substantially improved. Here, we introduce PepPre, which detects precursors by decomposing peaks into multiple isotope clusters using linear programming methods. The detected precursors are scored and ranked, and the high-scoring ones are used for subsequent peptide identification. PepPre is evaluated both on regular and cross-linked peptide data sets and compared with 11 methods. The experimental results show that PepPre achieves a remarkable increase of 203% in PSM and 68% in peptide identifications compared to instrument software for regular peptides and 99% in PSM and 27% in peptide pair identifications for cross-linked peptides, surpassing the performance of all other evaluated methods. In addition to the increased identification numbers, further credibility evaluations evidence the reliability of the identified results. Moreover, by widening the isolation window of data acquisition from 2 to 8 Th, with PepPre, an engine is able to identify at least 64% more PSMs, thereby demonstrating the potential advantages of wide-window data acquisition. PepPre is open-source and available at http://peppre.ctarn.io.


Subject(s)
Peptides, Proteomics, Reproducibility of Results, Proteomics/methods, Software, Tandem Mass Spectrometry/methods, Protein Databases, Algorithms
2.
Environ Sci Technol ; 58(21): 9175-9186, 2024 May 28.
Article in English | MEDLINE | ID: mdl-38743611

ABSTRACT

We include biodiversity impacts in forest management decision making by incorporating the countryside species area relationship model into the partial equilibrium model GLOBIOM-Forest. We tested three forest management intensities (low, medium, and high) and limited biodiversity loss via an additional constraint on regional species loss. We analyzed two scenarios for climate change mitigation. RCP1.9, the higher mitigation scenario, has more biodiversity loss than the reference RCP7.0, suggesting a trade-off between climate change mitigation, with increased bioenergy use, and biodiversity conservation in forests. This trade-off can be alleviated with biodiversity-conscious forest management by (1) shifting biomass production destined to bioenergy from forests to energy crops, (2) increasing areas under unmanaged secondary forest, (3) reducing forest management intensity, and (4) reallocating biomass production between and within regions. With these mechanisms, it is possible to reduce potential global biodiversity loss by 10% with minor changes in economic outcomes. The global aggregated reduction in biodiversity impacts does not imply that biodiversity impacts are reduced in each ecoregion. We exemplify how to connect an ecologic and an economic model to identify trade-offs, challenges, and possibilities for improved decisions. We acknowledge the limitations of this approach, especially of measuring and projecting biodiversity loss.


Subject(s)
Biodiversity, Climate Change, Conservation of Natural Resources, Forests, Biomass
3.
J Environ Manage ; 366: 121914, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39043090

ABSTRACT

Food Supply Chains (FSCs) have become increasingly complex with the average distance between producers and consumers rising considerably in the past two decades. Consequently, FSCs are a major source of carbon emissions and reducing transportation costs a major challenge for businesses. To address this, we present a mathematical model to promote the three core dimensions of sustainability (economic, environmental, and social), based on the Mixed-Integer Linear Programming (MILP) method. The model addresses the environmental dimension by intending to decrease the carbon emissions of different transport modes involved in the logistics network. Several supply chain network characteristics are incorporated and evaluated, with a consideration of social sustainability (job generation from operating various facilities). The mathematical model's robustness is demonstrated by testing and deploying it to a variety of problem instances. A real-life case study (Norwegian salmon supply chain) helps to comprehend the model's applicability. To understand the importance of optimizing food supply networks holistically, the paper investigates the impact of multiple supply chain permutations on total cost, demand fluctuations and carbon emissions. To address fluctuations in retail demand, we undertook sensitivity analysis for variations in demand, enabling the proposed model to revamp Norway's salmon supply chain network. Subsequently, the results are thoroughly examined to identify managerial implications.


Subject(s)
Food Supply, Salmon, Animals, Norway, Theoretical Models, Conservation of Natural Resources
4.
Entropy (Basel) ; 26(5), 2024 May 08.
Article in English | MEDLINE | ID: mdl-38785656

ABSTRACT

This paper studies the problem of minimizing the total cost, including computation cost and communication cost, in the system of two-sided secure distributed matrix multiplication (SDMM) under an arbitrary collusion pattern. In order to perform SDMM, the two input matrices are split into blocks, blocks of random matrices are appended to protect the security of the two input matrices, and encoded copies of the blocks are distributed to all computing nodes for matrix multiplication. Our aim is to minimize the total cost over all matrix splitting factors, numbers of appended random matrices, and distribution vectors, while satisfying the security constraint of the two input matrices, the decodability constraint of the desired result of the multiplication, the storage capacity of the computing nodes, and the delay constraint. First, a strategy of appending zeros to the input matrices is proposed to overcome the divisibility problem of matrix splitting. Next, the optimization problem is divided into two subproblems with the aid of alternating optimization (AO), where a feasible solution can be obtained. In addition, some necessary conditions for the problem to be feasible are provided. Simulation results demonstrate the superiority of our proposed scheme compared to the scheme without appending zeros and the scheme with no alternating optimization.
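
The zero-appending step mentioned in this abstract can be illustrated with a minimal sketch. It only shows why padding removes the divisibility problem without changing the product; it does not model the random matrices, encoding, or cost optimization of SDMM. The function names, the example matrices, and the splitting factor of 2 are assumptions made for illustration, not taken from the paper.

```python
import math

def pad_to_multiple(matrix, row_factor, col_factor):
    """Append zero rows/columns so both dimensions are divisible by the factors."""
    rows, cols = len(matrix), len(matrix[0])
    new_rows = math.ceil(rows / row_factor) * row_factor
    new_cols = math.ceil(cols / col_factor) * col_factor
    padded = [row + [0] * (new_cols - cols) for row in matrix]
    padded += [[0] * new_cols for _ in range(new_rows - rows)]
    return padded

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# Hypothetical inputs: A is 3x2 and B is 2x3, so neither the rows of A nor the
# columns of B can be split evenly into 2 blocks.
A = [[1, 2], [3, 4], [5, 6]]
B = [[1, 0, 2], [0, 1, 3]]

A_pad = pad_to_multiple(A, row_factor=2, col_factor=1)  # becomes 4x2
B_pad = pad_to_multiple(B, row_factor=1, col_factor=2)  # becomes 2x4

# The padded product is 4x4; its top-left 3x3 corner is the original A*B.
product_pad = matmul(A_pad, B_pad)
cropped = [row[:3] for row in product_pad[:3]]
```

Because the appended rows and columns are all zero, they only add zero rows and columns to the product, so cropping recovers the desired result.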

5.
Poult Sci ; 103(6): 103636, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38547672

ABSTRACT

A Microsoft Excel workbook, User-Friendly Feed Formulation with Data from Australia (UffdAu.xlsm), has been developed for teaching feed formulation techniques to tertiary-level university students. It runs under both Microsoft Windows and Apple iOS operating systems. The example ingredient composition matrix is based on the Australian Feed Ingredient Database to illustrate the biological and econometric principles of least-cost feed formulation. The nutrient data are based roughly on recent primary breeder company recommendations. The workbook is easily adapted to the ingredients, nutrients, and prices most relevant to the students, wherever it is used. The workbook uses the linear routines of Excel's Solver add-in under the Data heading in the header Ribbon. There is a worksheet illustrating how to adapt non-linear responses, such as those to exogenous enzymes, to typical linear models using a step function. Additional worksheets illustrate how proximate analysis can be interpreted in modern analytical chemistry terms and how various feed energy measures are related to feed composition. UffdAu.xlsm is available free of charge from the Poultry Hub Australia website (https://www.poultryhub.org).


Subject(s)
Animal Feed, Poultry, Software, Animal Feed/analysis, Animals, Animal Husbandry/methods, Australia
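
As a rough illustration of the least-cost formulation principle described in this abstract, the sketch below solves a two-ingredient linear program by enumerating the corner points of the feasible region. This is not the workbook's method (which relies on Excel's Solver); the ingredient names, prices, protein contents, and the 20 % protein requirement are hypothetical values chosen for the example.

```python
from fractions import Fraction as F
from itertools import combinations

# Each constraint (a1, a2, b) means a1*x1 + a2*x2 <= b.
# x1 = wheat share, x2 = soybean-meal share of a 1 kg mix (hypothetical data).
CONSTRAINTS = [
    (F(1), F(1), F(1)),                       # x1 + x2 <= 1
    (F(-1), F(-1), F(-1)),                    # x1 + x2 >= 1
    (F(-12, 100), F(-46, 100), F(-20, 100)),  # protein: 0.12*x1 + 0.46*x2 >= 0.20
    (F(-1), F(0), F(0)),                      # x1 >= 0
    (F(0), F(-1), F(0)),                      # x2 >= 0
]
COST = (F(30, 100), F(60, 100))  # assumed price per kg of wheat, soybean meal

def solve_two_variable_lp(constraints, cost):
    """Minimize cost over the feasible corner points of a bounded 2-variable LP."""
    best = None
    for (a1, a2, b), (c1, c2, d) in combinations(constraints, 2):
        det = a1 * c2 - a2 * c1
        if det == 0:
            continue  # parallel constraint lines have no unique intersection
        # Cramer's rule for the intersection of the two constraint lines.
        x1 = (b * c2 - a2 * d) / det
        x2 = (a1 * d - b * c1) / det
        if all(p * x1 + q * x2 <= r for p, q, r in constraints):
            value = cost[0] * x1 + cost[1] * x2
            if best is None or value < best[0]:
                best = (value, x1, x2)
    return best  # None if no feasible corner point exists

best_cost, wheat, soy = solve_two_variable_lp(CONSTRAINTS, COST)
```

With these assumed numbers the cheapest mix uses just enough soybean meal to meet the protein requirement. Corner-point enumeration is only practical for tiny problems; real formulations with many ingredients need a simplex or interior-point solver.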
6.
Algorithms Mol Biol ; 19(1): 17, 2024 Apr 29.
Article in English | MEDLINE | ID: mdl-38679703

ABSTRACT

The graph traversal edit distance (GTED), introduced by Ebrahimpour Boroojeny et al. (2018), is an elegant distance measure defined as the minimum edit distance between strings reconstructed from Eulerian trails in two edge-labeled graphs. GTED can be used to infer evolutionary relationships between species by comparing de Bruijn graphs directly without the computationally costly and error-prone process of genome assembly. Ebrahimpour Boroojeny et al. (2018) propose two ILP formulations for GTED and claim that GTED is polynomially solvable because the linear programming relaxation of one of the ILPs always yields optimal integer solutions. The claim that GTED is polynomially solvable is contradictory to the complexity results of existing string-to-graph matching problems. We resolve this conflict in complexity results by proving that GTED is NP-complete and showing that the ILPs proposed by Ebrahimpour Boroojeny et al. do not solve GTED but instead solve for a lower bound of GTED and are not solvable in polynomial time. In addition, we provide the first two, correct ILP formulations of GTED and evaluate their empirical efficiency. These results provide solid algorithmic foundations for comparing genome graphs and point to the direction of heuristics. The source code to reproduce experimental results is available at https://github.com/Kingsford-Group/gtednewilp/ .
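
GTED is defined through the ordinary edit distance between strings, so a small sketch of that building block may help. The code below is the standard dynamic-programming edit distance (unit cost for insertion, deletion, and substitution); it does not compute GTED itself, which additionally minimizes over Eulerian trails in both graphs. The function name is an assumption for illustration.

```python
def edit_distance(s, t):
    """Levenshtein distance between strings s and t, one row at a time."""
    prev = list(range(len(t) + 1))  # distances from the empty prefix of s
    for i, a in enumerate(s, start=1):
        curr = [i]  # deleting the first i characters of s
        for j, b in enumerate(t, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a from s
                curr[j - 1] + 1,         # insert b into s
                prev[j - 1] + (a != b),  # match or substitute
            ))
        prev = curr
    return prev[-1]

example = edit_distance("kitten", "sitting")
```

For two strings this runs in time proportional to the product of their lengths; the hardness discussed in the abstract comes from the choice of trails, not from this step.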

7.
Heliyon ; 10(11): e31820, 2024 Jun 15.
Article in English | MEDLINE | ID: mdl-38845896

ABSTRACT

An integrated operations planning model for automotive wiring companies is studied to improve synchronization between production activities and inventory flows. These combined factors are growing in significance as they drive the need to take proactive steps in manufacturing and distributing wiring materials within the supply chain. This involves anticipating the requirements of different automotive manufacturers and thereby guaranteeing a consistent, uninterrupted, and punctual provision of raw wiring materials. This support is vital for sustaining the ongoing manufacturing operations in the automotive sector. For this push flow system, the proposed operational model is based on integer linear programming, considering capacity and bill of materials constraints to determine production quantities, inventory levels, and machine sizing. Real-life data from the automotive wiring industry validates the effectiveness of coordinated production and inventory activities, resulting in significant lead time reductions of up to 60 %. These findings provide compelling reasons for automotive wiring partners to engage in joint operations planning.

8.
J Comput Biol ; 31(5): 416-428, 2024 May.
Article in English | MEDLINE | ID: mdl-38687334

ABSTRACT

A Coding DNA Sequence (CDS) is a fraction of DNA whose nucleotides are grouped into consecutive triplets called codons, each one encoding an amino acid. Because most amino acids can be encoded by more than one codon, the same amino acid chain can be obtained by a very large number of different CDSs. These synonymous CDSs show different features that, also depending on the organism the transcript is expressed in, could affect translational efficiency and yield. The identification of optimal CDSs with respect to given transcript indicators is in general a challenging task, but it has been observed in recent literature that integer linear programming (ILP) can be a very flexible and efficient way to achieve it. In this article, we add evidence to this observation by proposing a new ILP model that simultaneously optimizes different well-grounded indicators. With this model, we efficiently find solutions that dominate those returned by six existing codon optimization heuristics.


Subject(s)
Algorithms, Codon, Genetic Models, Linear Programming, Codon/genetics, Base Sequence/genetics, DNA/genetics, Computational Biology/methods
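
To make concrete how quickly the number of synonymous CDSs grows, the sketch below multiplies the number of codons that encode each amino acid in the standard genetic code. It only counts candidates and does not implement the ILP model of the paper; the function name and the example peptides are assumptions for illustration.

```python
# Number of codons per amino acid in the standard genetic code
# (one-letter amino acid codes; the three stop codons are excluded).
CODON_DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}

def count_synonymous_cds(peptide):
    """Number of distinct coding sequences that translate to the peptide."""
    count = 1
    for amino_acid in peptide:
        count *= CODON_DEGENERACY[amino_acid]
    return count

short_example = count_synonymous_cds("MKT")    # 1 * 2 * 4
longer_example = count_synonymous_cds("MLSR")  # 1 * 6 * 6 * 6
```

Since the count is a product over positions, it grows exponentially with peptide length, which is why exhaustive search is not an option and optimization models are used instead.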
9.
J Bodyw Mov Ther ; 39: 496-504, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38876674

ABSTRACT

The purpose of this study was to analyze the effect of two different programming models of resistance training (RT) on metabolic risk, anthropometric variables, and relative strength in elderly women. The research was a prospective and comparative longitudinal study with a non-probabilistic random sample. Twenty-two elderly women (64 ± 3 years) were divided into two experimental groups, linear programming (LP, n = 12) and daily undulatory programming (DUP, n = 10), training 3 sessions/week for 12 weeks. Submaximal strength (10RM) was evaluated in the horizontal leg press (HL), pulldown (PD), leg curl (LC), vertical bench press (BP), and leg extension (LE). Anthropometric variables, food intake (R24h), and submaximal strength (10RM) were analyzed. Participants were initially classified as overweight or obese according to body mass index (BMI) and percentage of fat mass (%FM), and as having moderate to high risk of developing metabolic diseases according to hip-waist ratio (HWR), waist-height ratio (WHR), and waist circumference (WC). There was no change in metabolic risk or anthropometric variables after the intervention period. There was a significant improvement in relative strength, assessed as 10RM relative to body weight (10RM/BW) and to lean body mass (10RM/LBM) (p < 0.05), with large or medium effect sizes for most variables after 12 weeks of RT. In conclusion, both programming models increased relative strength after 12 weeks of RT, with attenuated change in body composition and metabolic risk, and either strategy can be used to improve strength in elderly women.


Subject(s)
Body Composition, Muscle Strength, Resistance Training, Humans, Resistance Training/methods, Female, Aged, Muscle Strength/physiology, Body Composition/physiology, Middle Aged, Prospective Studies, Longitudinal Studies, Body Mass Index, Anthropometry, Waist Circumference/physiology
10.
INFORMS J Comput ; 36(2): 434-455, 2024.
Article in English | MEDLINE | ID: mdl-38883557

ABSTRACT

Chemotherapy drug administration is a complex problem that often requires expensive clinical trials to evaluate potential regimens; one way to alleviate this burden and better inform future trials is to build reliable models for drug administration. This paper presents a mixed-integer program for combination chemotherapy (utilization of multiple drugs) optimization that incorporates various important operational constraints and, besides dose and concentration limits, controls treatment toxicity based on its effect on the count of white blood cells. To address the uncertainty of tumor heterogeneity, we also propose chance constraints that guarantee reaching an operable tumor size with a high probability in a neoadjuvant setting. We present analytical results pertinent to the accuracy of the model in representing biological processes of chemotherapy and establish its potential for clinical applications through a numerical study of breast cancer.

11.
Water Environ Res ; 96(5): e11031, 2024 May.
Article in English | MEDLINE | ID: mdl-38685725

ABSTRACT

The pollutant transport equilibrium in a watershed can be analyzed on a large time scale, and land-use export coefficients can be calculated directly under certain hydrologic and transport conditions, by ignoring hydrologic and transport processes at small space and time scales on hydrologic response units. In this study, the water environment system of a watershed was deconstructed into three parts (source, source-sink, and runoff transport) to construct a pollutant transportation equilibrium model on a large time scale. A watershed with an annual source-sink accumulation of zero was defined as a completely transported watershed; from this definition, we derived a completely transported equilibrium equation. The problem of finding the land export coefficients was converted into the problem of finding the optimal solution of a linear program, which can be estimated according to the variation in pollutant output processes. The feasibility of the solution can be analyzed using multi-year stochastic rainfall processes. The model was used to analyze the transport equilibrium of chemical oxygen demand (COD), total nitrogen (TN), and total phosphorus (TP) upstream of the monitored cross-sections in a watershed, which covered 3145.66 km². The land export coefficients were calculated according to the model. The model calculations indicated that the watershed was completely transported during perennial years. The calculated export coefficients of COD, TN, and TP for farmland, primary vegetation, and urban land were within the range of general empirical values. The calculated maximum accumulations of COD, TN, and TP were 0.19 × 10⁷, 0.063 × 10⁷, and 0.049 × 10⁶ kg, respectively, for perennial rainfall. PRACTITIONER POINTS: A completely transported watershed was defined, and a pollutant transportation equilibrium model on a large time scale was constructed. A linear programming problem was designed to estimate the land export coefficients of COD, TN, and TP. The runoff transport and accumulation processes of COD, TN, and TP in a watershed were analyzed.


Subject(s)
Theoretical Models, Water Movements, Chemical Water Pollutants, Chemical Water Pollutants/chemistry, Phosphorus/chemistry, Nitrogen/chemistry, Environmental Monitoring, Biological Oxygen Demand Analysis
12.
Biosystems ; 237: 105163, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38401640

ABSTRACT

In this paper, we explore the challenges associated with biomarker identification for diagnostic purposes in biomedical experiments, and propose a novel approach to handle these challenges via a generalization of the Dantzig selector. To improve the efficiency of the regularization method, we introduce a transformation from an inherently nonlinear program (owing to its nonlinear link function) into a linear programming framework under a reasonable assumption on the logistic probability range. We illustrate the use of our method on an experiment with binary response, showing superior performance in biomarker identification studies compared to the conventional analysis. Our proposed method does not merely serve as a variable/biomarker selection tool; its ranking of variable importance provides valuable reference information for practitioners to reach informed decisions regarding the prioritization of factors for further investigation.


Subject(s)
Biomarkers, Probability
13.
Bioresour Technol ; 398: 130523, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38437962

ABSTRACT

This work presents dynamic optimization strategies of batch hydrothermal liquefaction of two microalgal species, Aurantiochytrium sp. KRS101 and Nannochloropsis sp. to optimize the reactor temperature profiles. Three dynamic optimization problems are solved to maximize the endpoint biocrude yield, minimize the final time, and minimize the reactor thermal energy. The biocrude maximization and time minimization problems demonstrated 11% and 6.18% increment in the optimal biocrude yields and reduction of 78.2% and 61.66% in batch times compared to the base cases for the microalgae with higher lipid and protein fractions, respectively. The energy minimization problem revealed a significant reduction in the reactor thermal energies to generate the targeted biocrude yields compared to the biocrude maximization. Therefore, the identified optimal temperature trajectories outperformed the conventional fixed temperature profiles and could improve the overall economics of the batch bio-oil production from the algal-based biorefineries by significantly enhancing the reactor performance.


Subject(s)
Microalgae, Plant Oils, Polyphenols, Microalgae/metabolism, Water/metabolism, Biomass, Temperature
14.
Genome Biol ; 25(1): 170, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38951884

ABSTRACT

Microbial pangenome analysis identifies present or absent genes in prokaryotic genomes. However, current tools are limited when analyzing species with higher sequence diversity or higher taxonomic orders such as genera or families. The Roary ILP Bacterial core Annotation Pipeline (RIBAP) uses an integer linear programming approach to refine gene clusters predicted by Roary for identifying core genes. RIBAP successfully handles the complexity and diversity of Chlamydia, Klebsiella, Brucella, and Enterococcus genomes, outperforming other established and recent pangenome tools for identifying all-encompassing core genes at the genus level. RIBAP is a freely available Nextflow pipeline at github.com/hoelzer-lab/ribap and zenodo.org/doi/10.5281/zenodo.10890871.


Subject(s)
Bacterial Genome, Molecular Sequence Annotation, Software, Brucella/genetics, Brucella/classification, Bacteria/genetics, Bacteria/classification, Chlamydia/genetics, Enterococcus/genetics, Klebsiella/genetics
15.
J Comput Biol ; 31(4): 294-311, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38621180

ABSTRACT

Whole Genome Duplications (WGDs) are events that double the content and structure of a genome. In some organisms, multiple WGD events have been observed while loss of genetic material is a typical occurrence following a WGD event. The requirement of classic rearrangement models that every genetic marker has to occur exactly two times in a given problem instance, therefore, poses a serious restriction in this context. The Double-Cut and Join (DCJ) model is a simple and powerful model for the analysis of large structural rearrangements. After being extended to the DCJ-Indel model, capable of handling gains and losses of genetic material, research has shifted in recent years toward enabling it to handle natural genomes, for which no assumption about the distribution of markers has to be made. The traditional theoretical framework for studying WGD events is the Genome Halving Problem (GHP). While the GHP is solved for the DCJ model for genomes without losses, there are currently no exact algorithms utilizing the DCJ-Indel model that are able to handle natural genomes. In this work, we present a general view on the DCJ-Indel model that we apply to derive an exact polynomial time and space solution for the GHP on genomes with at most two genes per family before generalizing the problem to an integer linear program solution for natural genomes.


Subject(s)
Algorithms, Genome, Genetic Models, Genome/genetics, Gene Duplication, Molecular Evolution
16.
Heliyon ; 10(10): e31297, 2024 May 30.
Article in English | MEDLINE | ID: mdl-38818174

ABSTRACT

The current best-known performance guarantee for the extensively studied Traveling Salesman Problem (TSP) among deterministic approximation algorithms is 3/2, achieved by Christofides' algorithm 47 years ago. This paper investigates a new generalization of the TSP, termed the Minimum-Cost Bounded Degree Connected Subgraph (MBDCS) problem. In the MBDCS problem, the goal is to identify a minimum-cost connected subgraph containing n=|V| edges from an input graph G=(V,E) with degree upper bounds for particular vertices. We show that for certain special cases of MBDCS, the aim is equivalent to finding a minimum-cost Hamiltonian cycle for the input graph, the same as in the TSP. To solve MBDCS, we first present an integer programming formulation for the problem. Subsequently, we propose an algorithm to approximate the optimal solution by applying the iterative rounding technique to the solution of the integer programming relaxation. We demonstrate that the subgraph returned by our proposed algorithm achieves one of the best guarantees for the MBDCS problem in polynomial time, assuming P≠NP. This study views the optimization of TSP as finding a minimum-cost connected subgraph containing n edges with degree upper bounds for certain vertices, and it may provide new insights into optimizing the TSP in future research.
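
The special case mentioned in this abstract can be checked by brute force on a tiny graph: if every vertex has degree bound 2, a connected subgraph with n edges must give every vertex degree exactly 2, so it is a Hamiltonian cycle. The sketch below is an exhaustive check on a hypothetical 4-vertex complete graph, not the iterative rounding algorithm of the paper; the edge costs and function names are assumptions for illustration.

```python
from itertools import combinations, permutations

# Hypothetical symmetric edge costs on the complete graph with 4 vertices.
COST = {
    (0, 1): 2, (0, 2): 9, (0, 3): 10,
    (1, 2): 6, (1, 3): 4, (2, 3): 3,
}
N = 4

def is_connected(edges, n):
    """True if the chosen edges connect all n vertices."""
    adjacency = {v: set() for v in range(n)}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        for w in adjacency[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n

def min_bounded_degree_subgraph(cost, n, bound=2):
    """Cheapest connected subgraph with n edges and every degree <= bound."""
    best = None
    for edges in combinations(cost, n):
        degree = [0] * n
        for u, v in edges:
            degree[u] += 1
            degree[v] += 1
        if max(degree) <= bound and is_connected(edges, n):
            total = sum(cost[e] for e in edges)
            best = total if best is None else min(best, total)
    return best

def min_hamiltonian_cycle(cost, n):
    """Cheapest tour through all vertices, by trying every vertex order."""
    best = None
    for perm in permutations(range(1, n)):
        tour = (0,) + perm
        total = sum(
            cost[tuple(sorted((tour[i], tour[(i + 1) % n])))] for i in range(n)
        )
        best = total if best is None else min(best, total)
    return best

subgraph_cost = min_bounded_degree_subgraph(COST, N)
tour_cost = min_hamiltonian_cycle(COST, N)
```

On this instance both searches return the same value, the cost of the tour 0-1-3-2-0. Both enumerations are exponential and only meant to make the equivalence visible on a small example.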

17.
Algorithms Mol Biol ; 19(1): 5, 2024 Feb 06.
Article in English | MEDLINE | ID: mdl-38321522

ABSTRACT

BACKGROUND: Scaffolding is an intermediate stage of fragment assembly. It consists of orienting and ordering the contigs obtained by assembling the sequencing reads. In the general case, the problem has been widely studied using distance data between the contigs. Here we focus on dedicated scaffolding for chloroplast genomes. As these genomes are small, circular, and have few specific repeats, numerous approaches have been proposed to assemble them. However, their specificities have not been sufficiently exploited. RESULTS: We give a new formulation of scaffolding for chloroplast genomes as a discrete optimisation problem, whose decision version we prove to be [Formula: see text]-Complete. We take advantage of the knowledge of chloroplast genomes and succeed in expressing the relationships between a few specific genomic repeats as mathematical constraints. Our approach is independent of the distances and adopts a genomic-regions view, with priority given to scaffolding the repeats first. In this way, we encode the structural haplotype issue in order to retrieve several genome forms that coexist in the same chloroplast cell. To solve the optimisation problem exactly, we develop an integer linear program that we implement in the Python3 package khloraascaf. We test it on synthetic data to investigate its performance and its robustness against several chosen difficulties. CONCLUSIONS: We succeed in modelling biological knowledge of genomic structures to scaffold chloroplast genomes. Our results suggest that modelling genomic regions is sufficient for scaffolding repeats and is suitable for finding several solutions corresponding to several genome forms.

18.
Front Hum Neurosci ; 18: 1201574, 2024.
Article in English | MEDLINE | ID: mdl-38487104

ABSTRACT

Introduction: This study focuses on broadening the applicability of the metaheuristic L1-norm fitted and penalized (L1L1) optimization method in finding a current pattern for multichannel transcranial electrical stimulation (tES). The metaheuristic L1L1 optimization framework defines the tES montage via linear programming by maximizing or minimizing an objective function with respect to a pair of hyperparameters. Methods: In this study, we explore the computational performance and reliability of different optimization packages, algorithms, and search methods in combination with the L1L1 method. The solvers from Matlab R2020b, MOSEK 9.0, Gurobi Optimizer, CVX's SeDuMi 1.3.5, and SDPT3 4.0 were employed to produce feasible results through different linear programming techniques, including Interior-Point (IP), Primal-Simplex (PS), and Dual-Simplex (DS) methods. To solve the metaheuristic optimization task of L1L1, we implement an exhaustive and recursive search along with a well-known heuristic direct search as a reference algorithm. Results: Based on our results and the given optimization task, Gurobi's IP was, overall, the preferable choice among Interior-Point methods, while MOSEK's PS and DS were preferable among Simplex methods. These methods provided substantial computational time efficiency for solving the L1L1 method regardless of the applied search method. Discussion: While the best-performing solvers show that the L1L1 method is suitable for maximizing either focality or intensity, a few of these solvers could not find a bipolar configuration. Part of the discrepancies between these methods can be explained by a different sensitivity with respect to parameter variation or the resolution of the lattice provided.

19.
Math Biosci Eng ; 21(3): 3668-3694, 2024 Feb 18.
Article in English | MEDLINE | ID: mdl-38549301

ABSTRACT

Automatic test paper design is critical in education to reduce workloads for educators and facilitate an efficient teaching process. However, current designs fail to satisfy the realistic teaching requirements of educators, including the consideration of both test quality and efficiency. This is the main reason why teachers still manually construct tests in most teaching environments. In this paper, the quality of tests is quantitatively defined while considering multiple objectives, including a flexible coverage of knowledge points, cognitive levels, and question difficulty. Then, a model based on the technique of linear programming is delicately designed to explore the optimal results for this newly defined problem. However, this technique is not efficient enough, which cannot obtain results in polynomial time. With the consideration of both test quality and generation efficiency, this paper proposes a genetic algorithm (GA) based method, named dynamic programming guided genetic algorithm with adaptive selection (DPGA-AS). In this method, a dynamic programming method is proposed in the population initialization part to improve the efficiency of the genetic algorithm. An adaptive selection method for the GA is designed to avoid prematurely falling into the local optimal for better test quality. The question bank used in our experiments is assembled based on college-level calculus questions from well-known textbooks. The experimental results show that the proposed techniques can construct test papers with both high effectiveness and efficiency. The computation time of the test assembly problem is reduced from 3 hours to 2 seconds for a 5000-size question bank as compared to a linear programming model with similar test quality. The test quality of the proposed method is better than the other baselines.

20.
Algorithms Mol Biol ; 19(1): 8, 2024 Feb 27.
Article in English | MEDLINE | ID: mdl-38414060

ABSTRACT

One of the most fundamental problems in genome rearrangement studies is the (genomic) distance problem. It is typically formulated as finding the minimum number of rearrangements under a model that are needed to transform one genome into the other. A powerful multi-chromosomal model is the Double Cut and Join (DCJ) model. While the DCJ model is not able to deal with some situations that occur in practice, like duplicated or lost regions, it was extended over time to handle these cases. First, it was extended to the DCJ-indel model, solving the issue of lost markers. Later, ILP solutions for so-called natural genomes, in which each genomic region may occur an arbitrary number of times, were developed, making it possible in theory to solve the distance problem for any pair of genomes. However, some theoretical and practical issues remained unsolved. On the theoretical side, there exist two disparate views of the DCJ-indel model, motivated in the same way, but with different conceptualizations that could not be reconciled so far. On the practical side, while ILP solutions for natural genomes typically perform well on telomere-to-telomere resolved genomes, they have been shown in recent years to quickly lose performance on genomes with a large number of contigs or linear chromosomes. This has been linked to a particular technique, namely capping. Simply put, capping circularizes linear chromosomes by concatenating them during solving time, increasing the solution space of the ILP superexponentially. Recently, we introduced a new conceptualization of the DCJ-indel model within the context of another rearrangement problem. In this manuscript, we apply this new conceptualization to the distance problem. In doing so, we uncover the relation between the disparate conceptualizations of the DCJ-indel model. We are also able to derive an ILP solution to the distance problem that does not rely on capping. This solution significantly improves upon the performance of previous solutions on genomes with high numbers of contigs while still solving the problem exactly and being competitive in performance otherwise. We demonstrate the performance advantage on simulated genomes and show its practical usefulness in an analysis of 11 Drosophila genomes.
