All HF Hub posts

Ujjwal-Tyagi posted an update 1 day ago
So, Koreans are also making great progress, right behind the Chinese.
Here are two of their open-source AI models that are actually good at coding: upstage/Solar-Open-100B and skt/A.X-K1.
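For anyone who wants to poke at them, a hypothetical quick-start with transformers is below; Solar-Open-100B is ~100B parameters, so this assumes a multi-GPU machine, and the prompt is just an example.

```python
# Hypothetical quick try-out of one of the models mentioned above.
# A ~100B model needs serious hardware: device_map="auto" shards it
# across whatever GPUs are available. The prompt is only an example.
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="upstage/Solar-Open-100B",
    device_map="auto",
)
result = pipe(
    "Write a Python function that checks whether a string is a palindrome.",
    max_new_tokens=200,
)
print(result[0]["generated_text"])
```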
marksverdhei posted an update about 21 hours ago
Inspired by the heroes of day-zero quants ( @TheBloke @danielhanchen @shimmyshimmer @bartowski ), I decided to join the race by releasing the first FP8 quant of glm-4.7-flash! Not as easy as I expected, but I'm happy I still got it working within a few hours of the original model's release! I'd be interested in feedback if anyone wants to try it out!

marksverdhei/GLM-4.7-Flash-FP8

Note: If my PR to vLLM isn't merged yet, you might have to use my fork. Cheers! 🤗
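If you want to try the quant, a minimal offline-inference sketch with vLLM is below; it assumes a vLLM build that supports this model (or the fork mentioned above) and an FP8-capable GPU, and the prompt is a placeholder.

```python
# Minimal sketch: running the FP8 quant with vLLM's offline API.
# Assumes a vLLM build that supports the model (or the fork above)
# and an FP8-capable GPU. The quant config is read from the repo.
from vllm import LLM, SamplingParams

llm = LLM(model="marksverdhei/GLM-4.7-Flash-FP8")
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["Explain FP8 quantization in one paragraph."], params)
print(outputs[0].outputs[0].text)
```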
DawnC posted an update 1 day ago
VividFlow: Complete AI Image Transformation Platform 🎬🎨✨
Three powerful creative tools in one streamlined workspace. VividFlow combines professional video generation, intelligent background replacement, and artistic style transfer to transform your images with precision and creativity.

🎭 Triple Creative Powers
- Cinematic Video Generation transforms static images into smooth motion sequences from 0.5 to 5 seconds. Eight curated motion categories cover portraits, products, landscapes, and artistic content with precision-tuned templates.

- Intelligent Background Replacement generates photorealistic scenes from 24 professionally crafted presets spanning studios, natural environments, urban settings, and seasonal atmospheres. Advanced edge refinement handles complex subjects, while the built-in Touch Up tool eliminates artifacts through AI-powered inpainting for flawless results.

- Artistic Style Transfer converts photographs into stunning interpretations across six distinct styles including 3D Cartoon, Anime, Watercolor, and Oil Painting. Five balanced style blends create unique hybrid aesthetics, with optional Face Restore preserving subject identity during transformation.

⚡ Optimized Performance
Video generation completes in approximately 4 minutes with ongoing optimization targeting sub-60-second processing. Background replacement finishes in 30-40 seconds, while style transfer delivers results in 20-30 seconds. The independent three-tab architecture ensures smooth workflow without performance conflicts.

🎯 Professional Control
Seed-based reproducibility guarantees consistent results across all features. Background generation offers flexible composition modes, adjustable edge softening, and instant mask preview. Comprehensive parameter controls enable precise creative direction.

👉 Try it now: DawnC/VividFlow
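If you'd rather drive the Space from a script, a small gradio_client sketch is below; the Space's endpoint names aren't documented here, so it only inspects the API rather than guessing them.

```python
# Hypothetical sketch: connecting to the VividFlow Space from Python.
# Endpoint names and parameters are not documented in this post, so
# we inspect the API instead of guessing signatures.
from gradio_client import Client

client = Client("DawnC/VividFlow")
client.view_api()  # prints available endpoints and their signatures
```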

Support with a ❤️; your engagement drives priorities!
#AI #DeepLearning #ImageToVideo #StyleTransfer #CreativeAI
ZennyKenny posted an update 2 days ago
😎 My new personal website is live! Check out https://kennethhamilton.me to chat with an LLM about my professional skills and personal projects.

🙈 Think of it like a really, really vain version of ChatGPT.
sagar007 posted an update 3 days ago
🚀 I built a Multimodal Vision-Language Model using Gemma-270M + CLIP!

Just finished training my multimodal model on the full LLaVA-Instruct-150K dataset (157K samples) and wanted to share the results!

🔧 What I Built:
A vision-language model that can understand images and answer questions about them, combining (a rough wiring sketch follows this list):
- Google Gemma-3-270M (language)
- OpenAI CLIP ViT-Large/14 (vision)
- LoRA fine-tuning for efficiency
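Here's a simplified sketch of the wiring (not the full training code; exact dims, target modules, and token handling differ in the repo):

```python
# Simplified wiring sketch: project CLIP ViT-L/14 patch features into
# Gemma's embedding space; both backbones stay frozen, and LoRA adapters
# plus the projector carry the trainable parameters.
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM, CLIPVisionModel
from peft import LoraConfig, get_peft_model

vision = CLIPVisionModel.from_pretrained("openai/clip-vit-large-patch14")
lm = AutoModelForCausalLM.from_pretrained("google/gemma-3-270m")

for p in vision.parameters():  # freeze the vision tower
    p.requires_grad = False

projector = nn.Sequential(  # CLIP hidden size (1024) -> Gemma hidden size
    nn.Linear(1024, lm.config.hidden_size),
    nn.GELU(),
    nn.Linear(lm.config.hidden_size, lm.config.hidden_size),
)

# LoRA freezes the base LM and adds small trainable adapters.
lm = get_peft_model(lm, LoraConfig(r=16, target_modules=["q_proj", "v_proj"]))

def forward(pixel_values, input_ids):
    patches = vision(pixel_values).last_hidden_state   # (B, 257, 1024)
    img_embeds = projector(patches)                    # (B, 257, H)
    txt_embeds = lm.get_input_embeddings()(input_ids)  # (B, T, H)
    inputs_embeds = torch.cat([img_embeds, txt_embeds], dim=1)
    return lm(inputs_embeds=inputs_embeds).logits
```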

📊 Training Stats:
- 157,712 training samples (full LLaVA dataset)
- 3 epochs on A100 40GB
- ~9 hours training time
- Final loss: 1.333 training / 1.430 validation
- Only 18.6M trainable params (3.4% of 539M total)

📈 Benchmark Results:
- VQA Accuracy: 53.8%
- Works great for: animal detection, room identification, scene understanding

🔗 **Try it yourself:**
- 🤗 Model: sagar007/multigemma
- 🎮 Demo: https://huggingface.co/spaces/sagar007/Multimodal-Gemma
- 💻 GitHub: https://github.com/sagar431/multimodal-gemma-270m

Built with PyTorch Lightning + MLflow for experiment tracking. Full MLOps pipeline with CI/CD!
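The tracking hookup is roughly the standard Lightning + MLflow pattern; a minimal sketch (the experiment name and module are placeholders) looks like:

```python
# Minimal sketch of the Lightning + MLflow tracking mentioned above.
# The experiment name and LightningModule are placeholders.
import lightning.pytorch as pl
from lightning.pytorch.loggers import MLFlowLogger

logger = MLFlowLogger(experiment_name="multimodal-gemma")
trainer = pl.Trainer(max_epochs=3, logger=logger)
# trainer.fit(MyLightningModule(), datamodule=my_datamodule)  # placeholders
```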

Would love to hear your feedback! 🙏

#multimodal #gemma #clip #llava #vision-language #pytorch
danielhanchen posted an update about 24 hours ago
efecelik posted an update 1 day ago
MonsterMMORPG posted an update 3 days ago
Compared the quality and speed differences (with CUDA 13 & Sage Attention) of BF16 vs GGUF Q8 vs FP8 Scaled vs NVFP4 for Z Image Turbo, FLUX Dev, FLUX SRPO, FLUX Kontext, and FLUX 2. A full 4K step-by-step tutorial has also been published.

Full 4K tutorial : https://youtu.be/XDzspWgnzxI

Check the full 4K tutorial above to learn more and to see the original images at uncompressed quality and size.

Many people have wondered how much quality and speed actually differ between the BF16, GGUF, FP8 Scaled, and NVFP4 precisions. In this tutorial I compared all of these precision and quantization variants for both speed and quality, and the results are pretty surprising. Moreover, we have developed and published an NVFP4 model quant generator app and an FP8 Scaled quant generator app; the links to the apps are below if you want to use them. Furthermore, upgrading ComfyUI to CUDA 13 with properly compiled libraries is now strongly recommended: we have observed noticeable performance gains with CUDA 13. So for both SwarmUI and solo ComfyUI users, a CUDA 13 ComfyUI is now recommended.
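For intuition about why the format matters, here's a back-of-the-envelope estimate of weight memory alone (BF16 is 16 bits per parameter, FP8 Scaled 8, NVFP4 4, and GGUF Q8 roughly 8.5 once block scales are counted; the 12B parameter count is illustrative, not any specific model's):

```python
# Back-of-the-envelope weight-memory estimate per precision format.
# Bits/param: BF16=16, FP8=8, NVFP4=4; GGUF Q8_0 is ~8.5 effective
# bits including block scales. The 12B param count is illustrative.
PARAMS = 12e9

for name, bits in [("BF16", 16), ("GGUF Q8 (approx)", 8.5),
                   ("FP8 Scaled", 8), ("NVFP4", 4)]:
    gb = PARAMS * bits / 8 / 1e9
    print(f"{name:>17}: ~{gb:.1f} GB of weights")
```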
ZomiLanguage posted an update 1 day ago
🧠🌍 Zomi Language AI: Community-Driven, Open-Source

*Image: Zomi Language AI - From Community to Model*

The **Zomi language** carries identity, faith, and history for its people, yet it remains underrepresented in modern AI systems.

This project introduces a **community-driven, open-source AI translation framework** that enables Zomi to be trained into AI systems **ethically, transparently, and sustainably**: by native speakers, for future generations.

### 🔍 How It Works
🧑‍🤝‍🧑 Community Texts → 📦 Open Datasets → 🤖 AI Training → 📊 Evaluation → 🔁 Community Review
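As a toy illustration of the "Community Texts → Open Datasets" step, contributed parallel text could be packaged with the `datasets` library as below; the sentence pair, column names, and repo id are placeholders, not real corpus entries.

```python
# Toy illustration of publishing community-contributed parallel text
# as an open dataset. The sentence pair, column names, and repo id
# are invented placeholders, not real corpus entries.
from datasets import Dataset

pairs = {
    "zomi": ["Na dam maw?"],       # placeholder sentence
    "english": ["How are you?"],   # placeholder translation
}

ds = Dataset.from_dict(pairs)
ds.push_to_hub("your-org/zomi-english-parallel")  # hypothetical repo id
```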

### 🔓 Why Open-Source Matters
- 🤝 Community ownership
- 🕊️ Cultural & faith integrity
- ♻️ Long-term sustainability
- 🔍 Transparent datasets & models

This initiative demonstrates how **low-resource languages can shape the future of inclusive AI** through open collaboration.

> *No language should be digitally invisible.*

**@Zomi Language | fb.com/ZomiLanguage**

### ๐Ÿท๏ธ Tags
#OpenSourceAI #LowResourceLanguages #NLP #MachineTranslation #LanguagePreservation #CommunityAI #ZomiLanguage
ovi054 posted an update 1 day ago
My project, Anim-Lab-AI, won the Community Choice Award at the MCP-1st-Birthday hackathon by @HuggingFace and @Gradio! 🏆

It turns any idea or complex concept into a clear, engaging explainer animation video. 🎥

I want to thank everyone in the Hugging Face community for supporting my project!

MCP-1st-Birthday/anim-lab-ai