AI & ML interests

A one-year-long research workshop on large language models: the Summer of Language Models 21 🌸

Recent Activity

Bloom — discussion #2 opened 2 months ago by Raz-Test
BramVanroy posted an update 3 months ago:
What are currently the best multilingual models with at most 72B parameters? Are Llama 3.3 70B and Qwen 2.5 72B still king?
davanstrien posted an update 5 months ago
BramVanroy posted an update 5 months ago:
By popular request, I've just added two subsets to the CommonCrawl-CreativeCommons Corpus (C5; BramVanroy/CommonCrawl-CreativeCommons) so that you do not have to do the filtering manually:

- C5f (BramVanroy/CommonCrawl-CreativeCommons-fine): only retains high-quality samples that are also present in FineWeb or FineWeb-2;
- C5r (https://huggingface.co/datasets/BramVanroy/CommonCrawl-CreativeCommons-recommended): applies additional strict filtering that removes samples with license disagreement, non-commercial licenses, and Wikipedia samples. The latter because Wikipedia content is best obtained from a more reliable source that provides better-parsed text.

It goes without saying that these filters lead to a massive reduction in quantity. Document and token counts are given on the dataset pages.
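The subsets above can be pulled with the Hugging Face `datasets` library. A minimal sketch, assuming `datasets` is installed; the `C5_SUBSETS` mapping and the `load_subset` helper are illustrative names, not part of the datasets themselves, and the repos may require a per-language config name that is not shown here:

```python
# Repo IDs taken from the post above; C5f and C5r are the two new subsets.
C5_SUBSETS = {
    "C5": "BramVanroy/CommonCrawl-CreativeCommons",              # full corpus
    "C5f": "BramVanroy/CommonCrawl-CreativeCommons-fine",        # high-quality, also in FineWeb/FineWeb-2
    "C5r": "BramVanroy/CommonCrawl-CreativeCommons-recommended", # strictest filtering
}


def load_subset(name: str, streaming: bool = True):
    """Load one C5 subset by short name.

    streaming=True iterates over the data without downloading the
    whole subset first, which matters at this corpus's scale.
    """
    from datasets import load_dataset  # assumes `pip install datasets`

    return load_dataset(C5_SUBSETS[name], streaming=streaming)


# Example (hypothetical usage; triggers a network request):
# ds = load_subset("C5r")
# first = next(iter(ds["train"]))
```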