telcom 
posted an update about 6 hours ago
NVIDIA’s Groq deal ... I think inference efficiency is becoming the main driver of profitability, and NVIDIA’s Groq deal is evidence the market is moving from “who can train the biggest model” to “who can serve it cheapest and fastest at scale.” That points to a maturing phase of AI: not necessarily the end of a bubble, but definitely a correction in what “wins” long-term.
What do you think?

Individual users win only if it becomes cheaper, faster, and freer (as in software freedom) to run LLMs on their own hardware. Otherwise, these mega-deals are of no use to them.


I agree; hosting a custom model is currently not viable: both the up-front hardware and the serving costs are expensive. Whoever can host open-source models more cheaply will win the public’s attention. As a startup or individual, you can invest in fine-tuning and training on high-end hardware to build a specialised model and get the weights and pipeline perfect, but hosting it yourself and keeping up with demand, while keeping the service available, is difficult and not sustainable.
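To make the serving-cost point concrete, here is a toy back-of-envelope calculation. Every number in it (GPU price, throughput, utilisation) is an illustrative assumption, not a measurement; real costs depend heavily on the model, hardware, batching, and how idle the GPU sits between requests.

```python
# Toy back-of-envelope: cost of self-hosting generation on a rented GPU.
# All numbers are illustrative assumptions, not benchmarks.

def cost_per_million_tokens(gpu_hourly_usd: float, tokens_per_second: float) -> float:
    """USD cost to generate 1M tokens on a GPU billed hourly, at full utilisation."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly_usd / tokens_per_hour * 1_000_000

# Assumed: a $2/hour GPU sustaining 1000 tokens/s of aggregate throughput.
print(round(cost_per_million_tokens(2.0, 1000.0), 3))  # -> 0.556 (USD per 1M tokens)
```

The catch is the "full utilisation" assumption: a hosted API amortises the GPU across many customers, while a self-hoster pays for every idle hour, which is exactly why serving at scale (or with guaranteed availability) is where the economics get hard.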
