Articles
14
Tags
12
Categories
6
Home
Archives
Tags
Categories
Magnicord
March 2025
All Articles - 2
2025
2025-03-05: An Introduction to Policy Optimization
2025-03-05: Key Concepts in Reinforcement Learning
Re: Deep Learning From Scratch