How Financial News Can Be Used to Train Good Financial Models
Numbers tell you what happened, but news tells you why. I've written an article explaining how news can be used to train AI models for sentiment analysis and better forecasting. Hope you find it interesting!
Given a news title, it calculates a sentiment score: if the score crosses a certain threshold, the strategy decides to buy or sell. Each trade lasts one day, and the strategy then computes the daily return. For Tesla, the best model seems to be the regression.
Just a quick note: the model uses the closing price as the buy price, meaning it already reflects the impact of the news. If I had chosen the opening price, the results would have been less biased but less realistic given the data available.
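To make the rule concrete, here is a minimal sketch of the threshold-based strategy described above; the thresholds, column names, and the pandas layout are illustrative assumptions, not the article's exact code.

```python
import pandas as pd

# Hypothetical cutoffs on the sentiment score; the article tunes its own thresholds.
BUY_THRESHOLD = 0.5
SELL_THRESHOLD = -0.5

def one_day_trade_return(row: pd.Series) -> float:
    """Open a one-day position at the close based on the headline's sentiment score."""
    daily_return = (row["next_close"] - row["close"]) / row["close"]
    if row["sentiment_score"] >= BUY_THRESHOLD:
        return daily_return       # long: buy at today's close, exit at the next close
    if row["sentiment_score"] <= SELL_THRESHOLD:
        return -daily_return      # short: profit if the price falls
    return 0.0                    # score inside the band: no trade

# df has one row per headline, with columns: sentiment_score, close, next_close
# df["strategy_return"] = df.apply(one_day_trade_return, axis=1)
```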
I found it excellent and very well done. One of the best explanations of embeddings I've ever read. Well done, @hesamation! Had to share this: hesamation/primer-llm-embedding
Finally, I uploaded the model I developed for my master's thesis! Given a financial event, it provides explained predictions based on a dataset of past news and central bank speeches. Try it out here: SelmaNajih001/StockPredictionExplanation (Just restart the space and wait a minute)
While Hugging Face offers extensive tutorials on classification and NLP tasks, there is very little guidance on performing regression tasks with Transformers. In my latest article, I provide a step-by-step guide to running regression using Hugging Face, applying it to financial news data to predict stock returns. In this tutorial, you will learn how to:
- Prepare and preprocess textual and numerical data for regression
- Configure a Transformer model for regression tasks
- Apply the model to real-world financial datasets with fully reproducible code
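As a taste of the setup, here is a minimal sketch of configuring a Hugging Face Transformer for regression; the checkpoint name and the example target are placeholders, not the article's full pipeline.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "distilbert-base-uncased"  # placeholder: any encoder checkpoint works
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# num_labels=1 together with problem_type="regression" gives a single-value head
# and makes the model compute an MSE loss whenever labels are provided.
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint, num_labels=1, problem_type="regression"
)

inputs = tokenizer("Company X beats quarterly earnings expectations", return_tensors="pt")
labels = torch.tensor([[0.023]])  # e.g. the next-day return as a float target
outputs = model(**inputs, labels=labels)
print(outputs.loss, outputs.logits)  # MSE loss and the predicted return
```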
Predicting Stock Price Movements from News
I trained a model to predict stock price movements (Up, Down, Neutral) from company news.
Dataset: Articles linked to next-day price changes, covering Apple, Tesla, and more.
Approach: Fine-tuned allenai/longformer-base-4096 for classification.
Outcome: The model captures the link between news and stock movements, handling long articles and producing probability scores for each label.
Comparison: Shows promising alignment with stock trends, sometimes outperforming FinBERT.
Feel free to try the model and explore how news can influence stock predictions: SelmaNajih001/SentimentAnalysis
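If you want to query it quickly, a pipeline call along these lines should work; the repo id is the one from the post, while the exact label names returned depend on the model's config.

```python
from transformers import pipeline

# Text-classification pipeline over the fine-tuned Longformer checkpoint.
clf = pipeline(
    "text-classification",
    model="SelmaNajih001/SentimentAnalysis",
    top_k=None,  # return a probability score for every label (e.g. Up / Down / Neutral)
)

print(clf("Tesla reports record quarterly deliveries"))
```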
hey hey @mradermacher - VB from Hugging Face here, we'd love to onboard you over to our optimised xet backend!
as you know, we're in the process of upgrading our storage backend to xet (which helps us scale and offer blazingly fast upload/download speeds too): https://huggingface.co/blog/xet-on-the-hub now that we're certain the backend can scale even with big models like Llama 4 / Qwen 3, we're moving to the next phase of inviting impactful orgs and users on the hub over. since you're a big part of the open source ML community, we'd love to onboard you next and create some excitement about it in the community too!
in terms of actual steps, it should be as simple as one of the org admins joining hf.co/join/xet - we'll take care of the rest.
What inspired the Transformer architecture in the "Attention Is All You Need" paper? And how were various ideas combined to create this groundbreaking model?
In this lengthy article, I explore the story and the origins of some of the ideas introduced in the paper. We'll explore everything from the fundamental attention mechanism that lies at its heart to the surprisingly simple explanation for its name, Transformer.
Examples of ideas explored in the article:
- What was the inspiration for the attention mechanism?
- How did we go from attention to self-attention?
- Did the team have any other names in mind for the model?
and more...
I aim to tell the story of Transformers as I would have wanted to read it, and hopefully, one that appeals to others interested in the details of this fascinating idea. This narrative draws from video interviews, lectures, articles, tweets/Xs, and some digging into the literature. I have done my best to be accurate, but errors are possible. If you find inaccuracies or have any additions, please do reach out, and I will gladly make the necessary updates.
We're excited to introduce MemoryCode, a novel synthetic dataset designed to rigorously evaluate LLMs' ability to track and execute coding instructions across multiple sessions. MemoryCode simulates realistic workplace scenarios where a mentee (the LLM) receives coding instructions from a mentor amidst a stream of both relevant and irrelevant information.
But what makes MemoryCode unique?! The combination of the following (see the toy sketch after the list):
- Multi-Session Dialogue Histories: MemoryCode consists of chronological sequences of dialogues between a mentor and a mentee, mirroring real-world interactions between coworkers.
- Interspersed Irrelevant Information: Critical instructions are deliberately interspersed with unrelated content, replicating the information overload common in office environments.
- Instruction Updates: Coding rules and conventions can be updated multiple times throughout the dialogue history, requiring LLMs to track and apply the most recent information.
- Prospective Memory: Unlike previous datasets that cue information retrieval, MemoryCode requires LLMs to spontaneously recall and apply relevant instructions without explicit prompts.
- Practical Task Execution: LLMs are evaluated on their ability to use the retrieved information to perform practical coding tasks, bridging the gap between information recall and real-world application.
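To make the setup concrete, here is a purely illustrative toy example of the behavior being tested; it is not MemoryCode's actual schema or content.

```python
# A mentor updates a coding convention across sessions, with irrelevant chatter in
# between; the mentee must later apply the *latest* convention without being reminded.
sessions = [
    {"session": 1, "mentor": "From now on, start every function name with 'x_'."},
    {"session": 2, "mentor": "Reminder: the quarterly report is due on Friday."},  # filler
    {"session": 3, "mentor": "Update: prefix function names with 'fn_' instead of 'x_'."},
    {"session": 4, "mentor": "Please write a function that checks whether a number is prime."},
]

# The expected answer after session 4 follows the most recent instruction:
def fn_is_prime(n: int) -> bool:
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))
```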
Our Findings
1. While even small models can handle isolated coding instructions, the performance of top-tier models like GPT-4o dramatically deteriorates when instructions are spread across multiple sessions.
2. This performance drop isn't simply due to the length of the context. Our analysis indicates that LLMs struggle to reason compositionally over sequences of instructions and updates. They have difficulty keeping track of which instructions are current and how to apply them.