My Opinion on RL
By umjunsik132 · 2026-06-22 · 2 points · 0 comments
I think of RL as a method that produces its training data from the model's own predictions. This directly leads the model to extend its output range, because the data becomes more diverse. However, RL fundamentally relies on bootstrapping and has a moving-target problem, which are the reason for…
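One way to picture the bootstrapping and moving-target point is with a minimal sketch, assuming tabular TD(0) on a made-up two-step chain. The chain, `GAMMA`, `ALPHA`, and `td0_targets` are illustrative assumptions, not taken from the post or from any library. The target for the first state is built from the current estimate of the next state, so it shifts every time that estimate is updated.

```python
# Hypothetical chain: s0 -> s1 -> terminal, with reward 1 on the final step only.
GAMMA = 0.9   # discount factor (assumed value)
ALPHA = 0.5   # learning rate (assumed value)


def td0_targets(num_sweeps):
    """Run tabular TD(0) and record the bootstrapped target used for s0 each sweep."""
    values = [0.0, 0.0]  # estimates for s0 and s1; the terminal value is fixed at 0
    targets_for_s0 = []
    for _ in range(num_sweeps):
        # s0 -> s1, reward 0: the target is built from the current estimate of s1.
        target_s0 = 0.0 + GAMMA * values[1]
        targets_for_s0.append(target_s0)
        values[0] += ALPHA * (target_s0 - values[0])

        # s1 -> terminal, reward 1: this target is constant (terminal value is 0).
        target_s1 = 1.0 + GAMMA * 0.0
        values[1] += ALPHA * (target_s1 - values[1])
    return targets_for_s0, values


targets, values = td0_targets(5)
print(targets)  # the target for s0 changes from sweep to sweep
```

In this toy case the target for s0 keeps drifting upward as the estimate for s1 improves, so s0 is always regressing toward a value that has not settled yet. With function approximation the same effect is usually harder to control, since updating one state's estimate also changes the others.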