Search | BVS IEC

1.

Solving olympiad geometry without human demonstrations.

Trinh, Trieu H; Wu, Yuhuai; Le, Quoc V; He, He; Luong, Thang.

Nature ; 625(7995): 476-482, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38233616

ABSTRACT

Proving mathematical theorems at the olympiad level represents a notable milestone in human-level automated reasoning [1-4], owing to their reputed difficulty among the world's best talents in pre-university mathematics. Current machine-learning approaches, however, are not applicable to most mathematical domains owing to the high cost of translating human proofs into machine-verifiable format. The problem is even worse for geometry because of its unique translation challenges [1,5], resulting in severe scarcity of training data. We propose AlphaGeometry, a theorem prover for Euclidean plane geometry that sidesteps the need for human demonstrations by synthesizing millions of theorems and proofs across different levels of complexity. AlphaGeometry is a neuro-symbolic system that uses a neural language model, trained from scratch on our large-scale synthetic data, to guide a symbolic deduction engine through infinite branching points in challenging problems. On a test set of 30 latest olympiad-level problems, AlphaGeometry solves 25, outperforming the previous best method that only solves ten problems and approaching the performance of an average International Mathematical Olympiad (IMO) gold medallist. Notably, AlphaGeometry produces human-readable proofs, solves all geometry problems in the IMO 2000 and 2015 under human expert evaluation and discovers a generalized version of a translated IMO theorem in 2004.

Subjects

Mathematics, Natural Language Processing, Problem Solving, Humans, Mathematics/methods, Mathematics/standards

2.

Evaluating language models for mathematics through interactions.

Collins, Katherine M; Jiang, Albert Q; Frieder, Simon; Wong, Lionel; Zilka, Miri; Bhatt, Umang; Lukasiewicz, Thomas; Wu, Yuhuai; Tenenbaum, Joshua B; Hart, William; Gowers, Timothy; Li, Wenda; Weller, Adrian; Jamnik, Mateja.

Proc Natl Acad Sci U S A ; 121(24): e2318124121, 2024 Jun 11.

Article in English | MEDLINE | ID: mdl-38830100

ABSTRACT

There is much excitement about the opportunity to harness the power of large language models (LLMs) when building problem-solving assistants. However, the standard methodology of evaluating LLMs relies on static pairs of inputs and outputs; this is insufficient for making an informed decision about which LLMs are best to use in an interactive setting, and how that varies by setting. Static assessment therefore limits how we understand language model capabilities. We introduce CheckMate, an adaptable prototype platform for humans to interact with and evaluate LLMs. We conduct a study with CheckMate to evaluate three language models (InstructGPT, ChatGPT, and GPT-4) as assistants in proving undergraduate-level mathematics, with a mixed cohort of participants from undergraduate students to professors of mathematics. We release the resulting interaction and rating dataset, MathConverse. By analyzing MathConverse, we derive a taxonomy of human query behaviors and uncover that despite a generally positive correlation, there are notable instances of divergence between correctness and perceived helpfulness in LLM generations, among other findings. Further, we garner a more granular understanding of GPT-4 mathematical problem-solving through a series of case studies, contributed by experienced mathematicians. We conclude with actionable takeaways for ML practitioners and mathematicians: models that communicate uncertainty, respond well to user corrections, and can provide a concise rationale for their recommendations, may constitute better assistants. Humans should inspect LLM output carefully given their current shortcomings and potential for surprising fallibility.

Subjects

Language, Mathematics, Problem Solving, Humans, Problem Solving/physiology, Students/psychology

3.

Author Correction: Solving olympiad geometry without human demonstrations.

Trinh, Trieu H; Wu, Yuhuai; Le, Quoc V; He, He; Luong, Thang.

Nature ; 627(8004): E8, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38396303

4.

Grandmaster level in StarCraft II using multi-agent reinforcement learning.

Vinyals, Oriol; Babuschkin, Igor; Czarnecki, Wojciech M; Mathieu, Michaël; Dudzik, Andrew; Chung, Junyoung; Choi, David H; Powell, Richard; Ewalds, Timo; Georgiev, Petko; Oh, Junhyuk; Horgan, Dan; Kroiss, Manuel; Danihelka, Ivo; Huang, Aja; Sifre, Laurent; Cai, Trevor; Agapiou, John P; Jaderberg, Max; Vezhnevets, Alexander S; Leblond, Rémi; Pohlen, Tobias; Dalibard, Valentin; Budden, David; Sulsky, Yury; Molloy, James; Paine, Tom L; Gulcehre, Caglar; Wang, Ziyu; Pfaff, Tobias; Wu, Yuhuai; Ring, Roman; Yogatama, Dani; Wünsch, Dario; McKinney, Katrina; Smith, Oliver; Schaul, Tom; Lillicrap, Timothy; Kavukcuoglu, Koray; Hassabis, Demis; Apps, Chris; Silver, David.

Nature ; 575(7782): 350-354, 2019 Nov.

Article in English | MEDLINE | ID: mdl-31666705

ABSTRACT

Many real-world applications require artificial agents to compete and coordinate with other agents in complex environments. As a stepping stone to this goal, the domain of StarCraft has emerged as an important challenge for artificial intelligence research, owing to its iconic and enduring status among the most difficult professional esports and its relevance to the real world in terms of its raw complexity and multi-agent challenges. Over the course of a decade and numerous competitions [1-3], the strongest agents have simplified important aspects of the game, utilized superhuman capabilities, or employed hand-crafted sub-systems [4]. Despite these advantages, no previous agent has come close to matching the overall skill of top StarCraft players. We chose to address the challenge of StarCraft using general-purpose learning methods that are in principle applicable to other complex domains: a multi-agent reinforcement learning algorithm that uses data from both human and agent games within a diverse league of continually adapting strategies and counter-strategies, each represented by deep neural networks [5,6]. We evaluated our agent, AlphaStar, in the full game of StarCraft II, through a series of online games against human players. AlphaStar was rated at Grandmaster level for all three StarCraft races and above 99.8% of officially ranked human players.

Subjects

Reinforcement (Psychology), Video Games, Artificial Intelligence, Humans, Learning

5.

Comprehensive Analysis of N6-Methyladenosine RNA Methylation Regulators in the Diagnosis and Subtype Classification of Rheumatoid Arthritis.

Zhang, Shaoxiong; Sun, Shuo; Zhang, Yajuan; Liu, Jianping; Wu, Yuhuai; Zhang, Xiguang.

Biochem Genet ; 2023 Dec 19.

Article in English | MEDLINE | ID: mdl-38112894

ABSTRACT

m6A modification is the most abundant mRNA modification and plays an integral role in various biological processes in eukaryotes. However, the role of m6A regulators in rheumatoid arthritis remains unknown. This study aimed to determine the expression of m6A RNA methylation regulators in rheumatoid arthritis and their possible functional and prognostic value. We performed differential analysis in the comprehensive gene expression database GSE93272 dataset between non-rheumatoid arthritis patients and rheumatoid arthritis patients to obtain 15 important m6A regulators. A random forest model and lasso regression were used to screen the five most important m6A regulators to predict the risk of developing rheumatoid arthritis. After further validation using in vitro qPCR experiments, a nomogram model was developed based on the four most important m6A regulators (ELAVL1, WTAP, YTHDF1, and ALKBH5). Immune infiltration analysis and consensus clustering analysis were then performed. Decision curve analysis showed that the nomogram model could be beneficial to patients. Based on the selected m6A regulators, patients with rheumatoid arthritis were classified into two m6A patterns (ClusterA and ClusterB) via a consensus approach. Activated B cells, CD56dim natural killer cells, immature B cells, monocytes, natural killer T cells, and T lymphocytes were associated with ClusterA in immune infiltration analysis. Importantly, immune infiltration in patients with high ELAVL1 expression was strikingly similar to ClusterA. m6A regulators play a non-negligible role in the development of rheumatoid arthritis. A study of m6A patterns may provide future therapeutic options for rheumatoid arthritis.

6.

RETRACTED ARTICLE: MiR-28-5p Promotes Osteosarcoma Development by Suppressing URGCP Expression.

Zhang, Chuanlin; Wu, Yuhuai; Yue, Qiaoning; Zhang, Xiguang; Hao, Yinglu; Liu, Jianping.

Biochem Genet ; 62(1): 574, 2024 Feb.

Article in English | MEDLINE | ID: mdl-36995530

7.

STDP-Compatible Approximation of Backpropagation in an Energy-Based Model.

Bengio, Yoshua; Mesnard, Thomas; Fischer, Asja; Zhang, Saizheng; Wu, Yuhuai.

Neural Comput ; 29(3): 555-577, 2017 Mar.

Article in English | MEDLINE | ID: mdl-28095200

ABSTRACT

We show that Langevin Markov chain Monte Carlo inference in an energy-based model with latent variables has the property that the early steps of inference, starting from a stationary point, correspond to propagating error gradients into internal layers, similar to backpropagation. The backpropagated error is with respect to output units that have received an outside driving force pushing them away from the stationary point. Backpropagated error gradients correspond to temporal derivatives with respect to the activation of hidden units. These lead to a weight update proportional to the product of the presynaptic firing rate and the temporal rate of change of the postsynaptic firing rate. Simulations and a theoretical argument suggest that this rate-based update rule is consistent with those associated with spike-timing-dependent plasticity. The ideas presented in this article could be an element of a theory for explaining how brains perform credit assignment in deep hierarchies as efficiently as backpropagation does, with neural computation corresponding to both approximate inference in continuous-valued latent variables and error backpropagation, at the same time.

8.

Association of carotid atherosclerosis and recurrent cerebral infarction in the Chinese population: a meta-analysis.

Liu, Jianping; Zhu, Yun; Wu, Yuhuai; Liu, Yan; Teng, Zhaowei; Hao, Yinglu.

Neuropsychiatr Dis Treat ; 13: 527-533, 2017.

Article in English | MEDLINE | ID: mdl-28260898

ABSTRACT

Stroke, in which poor blood flow to the brain results in cell death, is the third leading cause of disability and mortality worldwide, and is unevenly distributed across the global population. The cumulative risk of recurrence varies greatly up to 10 years after the first stroke. Carotid atherosclerosis is a major risk factor for stroke. The aim of this study was to investigate and estimate the relationship between carotid atherosclerosis and risk of stroke recurrence in the Chinese population. We performed a systematic review and meta-analysis of randomized controlled trials published from 2000 to 2013, using the following databases: PubMed, Embase, Medline, Wanfang, and the China National Knowledge Infrastructure. Odds ratios with 95% confidence intervals were calculated to assess the strength of this association. A total of 22 studies, including 3,912 patients, 2,506 first-ever cases, and 1,406 recurrent cases, were pooled in this meta-analysis. Our results showed that the frequency of carotid atherosclerosis is higher in recurrent cases than in first-ever controls (78.88% vs 59.38%), and the statistical analysis demonstrated a significant positive association between carotid atherosclerosis and recurrent cerebral infarction (odds ratio: 2.87; 95% confidence interval: 2.42-3.37; P<0.00001) in a fixed-effect model. No significant heterogeneity was observed across all studies. In conclusion, our results showed that carotid atherosclerosis was associated with increased risk of recurrent stroke. However, further well-designed research with large sample sizes is still needed to identify the clear mechanism.
