geronimo

g-ronimo

https://medium.com/@geronimo7

geronimi73

AI & ML interests

fafo

Recent Activity

upvoted an article 10 days ago

Swift Transformers Reaches 1.0 – and Looks to the Future

liked a model 14 days ago

nvidia/llama-nemotron-embed-vl-1b-v2

upvoted an article 14 days ago

Small Yet Mighty: Improve Accuracy In Multimodal Search and Visual Document Retrieval with Llama Nemotron RAG Models

upvoted an article 10 days ago

Article

Swift Transformers Reaches 1.0 – and Looks to the Future

Sep 26, 2025

upvoted an article 14 days ago

Article

Small Yet Mighty: Improve Accuracy In Multimodal Search and Visual Document Retrieval with Llama Nemotron RAG Models

14 days ago

upvoted 4 articles about 2 months ago

Article

Continuous batching from first principles

Nov 25, 2025 • 306

Article

Text-to-image Architectural Experiments

Nov 13, 2025

Article

We’re open-sourcing our text-to-image model and the process behind it

Nov 12, 2025

Article

Diffusers welcomes FLUX-2

Nov 25, 2025 • 170

upvoted a collection 4 months ago

Granite Embedding Models

Collection

7 items • Updated Nov 17, 2025 • 30

upvoted a paper 6 months ago

MetaCLIP 2: A Worldwide Scaling Recipe

Paper • 2507.22062 • Published Jul 29, 2025 • 36

upvoted 2 articles 6 months ago

Article

Extending Transformer layers as Painters to DiT's

Aug 31, 2024

Article

LeRobot.js

Jul 14, 2025

upvoted an article 7 months ago

Article

Learn the Hugging Face Kernel Hub in 5 Minutes

Jun 12, 2025 • 151

upvoted 2 articles 8 months ago

Article

KV Cache from scratch in nanoVLM

Jun 4, 2025 • 110

Article

nanoVLM: The simplest repository to train your VLM in pure PyTorch

May 21, 2025 • 248

upvoted a paper 9 months ago

OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning

Paper • 2505.04601 • Published May 7, 2025 • 29

upvoted a collection 9 months ago

Vision

Collection

163 items • Updated 14 days ago • 1

upvoted an article 9 months ago

Article

Remote VAEs for decoding with Inference Endpoints 🤗

Feb 24, 2025

upvoted a paper 10 months ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7, 2025 • 204

upvoted an article 11 months ago

Article

SmolVLM2: Bringing Video Understanding to Every Device

Feb 20, 2025 • 322

upvoted a paper 12 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4, 2025 • 254

upvoted a paper about 1 year ago

Dolphin: Closed-loop Open-ended Auto-research through Thinking, Practice, and Feedback

Paper • 2501.03916 • Published Jan 7, 2025 • 16
