13log
Tags - PPO

2 posts in total


2026

01-09
Video-MTR: An RL-Based Multi-Turn Reasoning Framework for Long Videos

2025

12-15
PPO: The Proximal Policy Optimization Algorithm
© 2025 13 Lab. All Rights Reserved.