RLHF

ThoughtStorms Wiki

Context : ArtificialIntelligence, MachineLearning, LanguageModels

Reinforcement Learning from Human Feedback

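Most language models are first trained on large amounts of human-generated data. Human feedback is then used to improve the model and to overcome problems inherited from that data: people rate or compare the model's outputs, and the model is further trained with reinforcement learning to produce the kind of output they prefer.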
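One common way to turn human feedback into a training signal is to have people pick the better of two model outputs and train a reward model on those choices. Below is a minimal sketch in Python, assuming a pairwise (Bradley-Terry style) preference loss. The function name and the example scores are hypothetical illustrations, not taken from the linked guide or from any particular library.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise preference loss: -log(sigmoid(reward_chosen - reward_rejected)).

    The loss is small when the reward model scores the human-preferred
    output higher than the rejected one, and large when it gets them backwards.
    """
    margin = reward_chosen - reward_rejected
    # Numerically stable form of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical reward-model scores for two answers to the same prompt.
agrees_with_human = preference_loss(reward_chosen=2.0, reward_rejected=0.0)
disagrees_with_human = preference_loss(reward_chosen=0.0, reward_rejected=2.0)
undecided = preference_loss(reward_chosen=0.0, reward_rejected=0.0)  # about 0.693 (log 2)
```

Minimising this loss over many human comparisons pushes the reward model to rank outputs the way people do. The language model can then be fine-tuned with reinforcement learning against that reward model.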

https://kili-technology.com/large-language-models-llms/exploring-reinforcement-learning-from-human-feedback-rlhf-a-comprehensive-guide
