Hugging Face
ComPO
community
AI & ML interests
None defined yet.
Recent Activity
PeterLauLukCh authored a paper 1 day ago
Exploration v.s. Exploitation: Rethinking RLVR through Clipping, Entropy, and Spurious Reward
PeterLauLukCh authored a paper 1 day ago
GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators
PeterLauLukCh submitted a paper 8 days ago
Exploration v.s. Exploitation: Rethinking RLVR through Clipping, Entropy, and Spurious Reward
Team members
2
ComparisonPO's models (18)
Sort: Recently updated
ComparisonPO/Mistral-7B-Instruct-A0.21B-ComPO • 7B • Updated Aug 4 • 6
ComparisonPO/Mistral-7B-Instruct-ComPO-3300pert-300iter-2 • 7B • Updated Jul 21 • 6
ComparisonPO/Mistral-7B-Instruct-ComPO-3300pert-300iter • 7B • Updated Jul 13 • 8
ComparisonPO/Gemma-2-9b-it-SimPO-ComPO • 9B • Updated Apr 10 • 4
ComparisonPO/Mistral-Instruct-7B-SimPO-ComPO • 7B • Updated Apr 7 • 8
ComparisonPO/Llama-3-Instruct-8B-SimPO-ComPO • 8B • Updated Apr 5 • 8
ComparisonPO/Llama-3-Base-8B-DPO-ComPO • 8B • Updated Mar 26 • 8
ComparisonPO/Llama-3-Instruct-8B-DPO-ComPO • 8B • Updated Mar 19 • 7
ComparisonPO/Llama-3-Base-8B-DPO_clean • 8B • Updated Mar 17 • 10
ComparisonPO/Mistral-Base-7B-DPO • 7B • Updated Mar 17 • 16 • 2
ComparisonPO/Llama-3-Base-8B-DPO • 8B • Updated Mar 17 • 5
ComparisonPO/Llama-3-Instruct-8B-DPO_clean • 8B • Updated Mar 17 • 6
ComparisonPO/Llama-3-Instruct-8B-DPO • 8B • Updated Mar 17 • 8
ComparisonPO/Mistral-Base-7B-DPO_clean • 7B • Updated Mar 17 • 6 • 2
ComparisonPO/Mistral-Instruct-7B-DPO • 7B • Updated Mar 17 • 6
ComparisonPO/Mistral-Instruct-7B-DPO-ComPO • 7B • Updated Mar 17 • 8 • 1
ComparisonPO/Mistral-Base-7B-DPO-ComPO • 7B • Updated Mar 17 • 10 • 1
ComparisonPO/Mistral-Instruct-7B-DPO_clean • 7B • Updated Mar 17 • 8