Reinforcement Learning from Human Feedback
RLHF has been the driving force behind the recently released ChatGPT. The concept itself is older, though: it dates back to 2017, when it was introduced in a paper by DeepMind and OpenAI researchers.