RL-based Reasoning
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution [paper]
FAIR at Meta, UIUC, GenAI at Meta, CMU
Memento: Fine-tuning LLM Agents without Fine-tuning LLMs
Time: 25 Aug 2025
Uses an external case memory to store past experience; the agent learns to read and write cases while the LLM's weights stay fixed
A soft Q-learning framework learns from experience to select the most relevant cases for each situation
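The case-selection idea above can be sketched as a toy. This is not the paper's implementation: the class `CaseMemory`, its methods, the tabular Q-values keyed by `(state, case)`, and the single-step reward update are all assumptions made for illustration. It only shows the general soft Q-learning pattern, where cases are sampled from a softmax over Q-values and the Q-value of the chosen case is moved toward the observed reward, with no LLM weights involved.

```python
import math
import random
from collections import defaultdict


class CaseMemory:
    """Toy external case bank with a softmax (soft Q) read policy.

    Hypothetical sketch: names and the tabular Q representation are
    illustrative assumptions, not taken from the Memento paper.
    """

    def __init__(self, temperature=0.5, lr=0.1, seed=0):
        self.cases = []                # stored past experiences
        self.q = defaultdict(float)    # Q[(state, case_index)], default 0.0
        self.temperature = temperature
        self.lr = lr
        self.rng = random.Random(seed)

    def write(self, case):
        """Append a case to memory and return its index."""
        self.cases.append(case)
        return len(self.cases) - 1

    def policy(self, state):
        """Softmax over Q(state, case) / temperature for all stored cases."""
        qs = [self.q[(state, i)] for i in range(len(self.cases))]
        top = max(qs)  # subtract the max for numerical stability
        weights = [math.exp((q - top) / self.temperature) for q in qs]
        total = sum(weights)
        return [w / total for w in weights]

    def read(self, state):
        """Sample one case index (and the case) from the softmax policy."""
        probs = self.policy(state)
        idx = self.rng.choices(range(len(self.cases)), weights=probs, k=1)[0]
        return idx, self.cases[idx]

    def update(self, state, idx, reward):
        """Single-step update: move Q(state, idx) toward the reward."""
        key = (state, idx)
        self.q[key] += self.lr * (reward - self.q[key])


if __name__ == "__main__":
    memory = CaseMemory()
    good = memory.write("plan that solved a similar task")
    bad = memory.write("plan that failed on a similar task")
    for _ in range(50):
        idx, _case = memory.read("task-type-A")
        memory.update("task-type-A", idx, 1.0 if idx == good else 0.0)
    print(memory.policy("task-type-A"))
```

The temperature controls how sharply the policy prefers high-value cases: lower values make retrieval closer to greedy, higher values keep more exploration across stored cases.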
Note:
See The State of Reinforcement Learning for LLM Reasoning for a nice overview