Hey everyone! Just wanted to share this awesome dataset, which features over 1 million tokens specifically for the Egyptian dialect. Check it out: HeshamHaroon/1milion_token_EGY_songs
Fav dataset
HuggingFaceFW/fineweb Viewer • Updated Jul 11, 2025 • 52.5B • 200k • 2.61k
HeshamHaroon/ArzEn-MultiGenre Viewer • Updated Dec 31, 2023 • 26k • 947 • 12
gretelai/synthetic_text_to_sql Viewer • Updated Dec 16, 2025 • 106k • 2.46k • 624
Anthropic/persuasion Viewer • Updated Apr 9, 2024 • 3.94k • 175 • 199
My favorite models
meta-llama/Meta-Llama-3-8B Text Generation • 8B • Updated Sep 27, 2024 • 1.54M • 6.44k
meta-llama/Meta-Llama-3-8B-Instruct Text Generation • 8B • Updated Jun 18, 2025 • 1.48M • 4.36k
ai21labs/Jamba-v0.1 Text Generation • 52B • Updated Sep 11, 2024 • 751 • 1.19k
CohereLabs/c4ai-command-r-plus Text Generation • 104B • Updated Apr 16, 2025 • 2k • 1.77k