top of page

Reinforcement Learning from Human Feedback

RLHF has been the driving force behind the recent released ChatGPT. However, the concept came about a few years back in 2017 through a paper by DeepMind & OpenAI researchers.

bottom of page