Oct 15, 2023 Understanding Phenomenal REINFORCE Policy Gradient Method
Jul 28, 2023 Reinforcement Learning from Human Feedback (RLHF) Presentation Slides