Exploration Hacking: Can LLMs Learn to Resist RL Training?

(alignmentforum.org)

2 points | by Prof_Sigmund 8 hours ago

No comments yet.