An implementation of Reinforcement Learning from Human Feedback (RLHF) for a text summarization task, using CarperAI’s trlX framework.

Source: "Implementing RLHF: Learning to Summarize with trlX" (Weights & Biases)