Reinforcement Learning from Human Feedback

41 points | 2 comments | 3 hours ago
klelatti

Web version with links, etc.:

https://rlhfbook.com/
