RL-based Reasoning

  • SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution [paper]

    • FAIR at Meta, UIUC, GenAI at Meta, CMU

  • Memento: Fine-tuning LLM Agents without Fine-tuning LLMs

    • Time: 25 Aug 2025

      1. Store past experience in an external case memory, and learn to read and write cases while the LLM's weights stay fixed.

      2. Use a soft Q-learning framework that learns, from experience, to select the most relevant cases for each situation.
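The two points above can be illustrated with a toy sketch. This is not Memento's actual implementation: the names `Case`, `CaseMemory`, `alpha`, and `lr` are hypothetical, situations are plain string labels instead of learned embeddings, and the update is a simplified single-step rule that moves Q toward the observed reward. It only shows the general shape of the idea: cases are written to an external memory, and a softmax over Q / alpha (the soft Q-learning policy form) decides which case to read, with no LLM weights involved.

```python
import math
import random
from dataclasses import dataclass, field


@dataclass
class Case:
    task: str      # description of the past task
    plan: str      # what the agent did
    reward: float  # observed outcome, e.g. 1.0 for success, 0.0 for failure


@dataclass
class CaseMemory:
    alpha: float = 0.5  # temperature (assumed): higher means more exploration
    lr: float = 0.1     # step size for the Q update (assumed)
    cases: list = field(default_factory=list)
    q: dict = field(default_factory=dict)  # (situation, case index) -> Q value

    def write(self, task, plan, reward):
        """Append a new case; the LLM itself is never modified."""
        self.cases.append(Case(task, plan, reward))

    def policy(self, situation):
        """Probability of retrieving each case: softmax of Q / alpha."""
        qs = [self.q.get((situation, i), 0.0) for i in range(len(self.cases))]
        if not qs:
            return []
        top = max(qs)  # subtract the max for numerical stability
        weights = [math.exp((v - top) / self.alpha) for v in qs]
        total = sum(weights)
        return [w / total for w in weights]

    def read(self, situation, rng=random):
        """Sample the index of a case to retrieve for this situation."""
        probs = self.policy(situation)
        if not probs:
            raise ValueError("case memory is empty")
        return rng.choices(range(len(probs)), weights=probs, k=1)[0]

    def update(self, situation, index, reward):
        """Single-step update: move Q(situation, case) toward the reward."""
        key = (situation, index)
        old = self.q.get(key, 0.0)
        self.q[key] = old + self.lr * (reward - old)


mem = CaseMemory(alpha=0.5, lr=0.5)
mem.write("fix failing test", "read the traceback first", 1.0)
mem.write("fix failing test", "rewrite the whole module", 0.0)

# Pretend that retrieving case 0 helped and case 1 did not, several times over.
for _ in range(5):
    mem.update("bugfix", 0, 1.0)
    mem.update("bugfix", 1, 0.0)

probs = mem.policy("bugfix")
chosen = mem.read("bugfix", random.Random(0))
```

After these updates, `probs` puts most of its mass on case 0 for the "bugfix" situation, while an unseen situation still gets a uniform distribution over cases. A real system would replace the string labels with a similarity function or learned Q network over state and case representations.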

Note:

See The State of Reinforcement Learning for LLM Reasoning for a nice overview.
