
NVIDIA: Facing the $4 trillion Goliath, the resistance organizes in Paris
The AI world gathered on July 8 and 9 at the prestigious Carrousel du Louvre in Paris for the second edition of the Raise Summit. A high mass that drew the big names of the industry, but with one notable absentee: GPU giant Nvidia. While it led the dance at Computex in Taipei and Vivatech in Paris in May and June, the chip titan had no stand on this otherwise conquered ground.
The main conference room of the Raise Summit in Paris. © Adrian Branco for Les Numériques
Conquered, really? While “Nvidia partner” signs appeared here and there, the bulk of the companies present at the Raise Summit 2025 in Paris were software players, data-center operators, model makers and the like, all of whom use, to varying degrees, the American titan’s powerful GPUs.
But within this horde of software players de facto powered by Nvidia, a hardware resistance was present. For if Nvidia is indeed the king of AI training, things could turn out differently for the execution of AI, a task called inference that already has its champions, large and small.
Inference, or the engine of AI
© Adrian Branco for Les Numériques
When it comes to AI training, Nvidia has, as they say, sewn up the game. Its GPUs are the most powerful, its InfiniBand network is the fastest, its software is the most complete, the whole forming the trio best able to scale up (“scalability” in the jargon), and the only one capable of training models of several hundred billion parameters.
But there is one area where the match is not yet decided: the execution of AI, known as “inference”. Once the training phase, weeks (or months!) of intensive computation, is over, the result is a model: a large program that must then be run. And while training keeps getting more complex and energy-hungry, the process of running these AIs (inference, then) aims to be as cheap and as frugal as possible. That is where the resistance is trying its luck.
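To make the training/inference split concrete, here is a minimal sketch in PyTorch (the framework SambaNova’s CEO cites later in this article); the toy model and random data are our own illustrative placeholders, not anything the companies mentioned here actually run.

```python
import torch
import torch.nn as nn

# A deliberately tiny stand-in model; real LLMs have billions of parameters.
model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 4))

# --- Training: the expensive phase, run for weeks on GPU clusters ---
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
x, y = torch.randn(32, 16), torch.randint(0, 4, (32,))  # dummy batch
for _ in range(100):                     # real training: millions of steps
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()                      # computing gradients is the costly part
    optimizer.step()

# --- Inference: run the frozen model as cheaply as possible ---
model.eval()
with torch.no_grad():                    # no gradients: far less compute and memory
    prediction = model(torch.randn(1, 16)).argmax(dim=1)
```

The loop is what monopolizes GPU clusters for weeks; the final no_grad call is, in miniature, the workload the challengers below want to win.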
If AMD and Intel immediately come to mind, you are halfway there. Yes, these two semiconductor heavyweights develop chips and software stacks. Yes, they have competing solutions, and their CPUs as well as their GPUs perform inference, in our PCs (because yes, PCs run small models) but also in the cloud. But their CPU and GPU architectures are very close to what Nvidia offers, and their success is largely hampered by the latter.
SambaNova, the all-in-one American champion
Rodrigo Liang, co-founder and CEO of SambaNova. © Adrian Branco for Les Numériques
Among the rising startups is the American company SambaNova. Don’t bother searching your records: you have never used its machines directly. “We design systems to run companies’ AI,” SambaNova’s co-founder and CEO, Rodrigo Liang, told Les Numériques.
“Our specialty is powering companies’ private AI. For many companies, it is inconceivable to send confidential queries or documents to ChatGPT,” Rodrigo Liang explains. Whether it is models enriched with private data (so-called RAG, retrieval-augmented generation) or simply large general-purpose models, “we give companies the tools to keep confidential data out of public view”.
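For readers unfamiliar with the acronym, here is a purely illustrative sketch of the RAG pattern Liang alludes to; the embed and generate functions are hypothetical stand-ins for whatever embedding model and in-house LLM a company actually deploys.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical embedding model turning text into a vector."""
    rng = np.random.default_rng(abs(hash(text)) % 2**32)
    return rng.standard_normal(128)

def generate(prompt: str) -> str:
    """Hypothetical in-house LLM; the point is that it runs on private hardware."""
    return f"[answer grounded in: {prompt[:60]}...]"

# 1. Index private documents; they never leave the company's servers.
documents = ["Q3 internal sales report...", "Confidential product roadmap..."]
index = [(doc, embed(doc)) for doc in documents]

# 2. Retrieve the document closest to the question (cosine similarity).
question = "What does the roadmap say about 2026?"
q = embed(question)
best_doc = max(index, key=lambda d: d[1] @ q /
               (np.linalg.norm(d[1]) * np.linalg.norm(q)))[0]

# 3. Generate an answer enriched with the retrieved private context.
print(generate(f"Context: {best_doc}\nQuestion: {question}"))
```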
But why on earth not use Nvidia chips in that case? “When you run your models on your own machines, one thing becomes very important: cost. A cost tied to the machines, the buildings and the energy consumed. On a model like DeepSeek, we are far more efficient than Nvidia’s inference solutions,” says Liang, who illustrates it this way: “An Nvidia GB200 NVL72 rack consumes 140 kW! With our racks equipped with our SN40L RDU chips, we deliver the same inference performance for only 10 kW.” He is pointing at the energy savings, but also at the data-center floor space saved.
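Taken at face value, Liang’s per-rack figures imply an order-of-magnitude gap in running costs. A quick back-of-the-envelope calculation makes that tangible; note that the 0.15 €/kWh electricity price is our own assumption, not a SambaNova figure.

```python
# Yearly energy and cost per rack, using the per-rack power figures quoted above.
nvidia_rack_kw = 140       # Nvidia GB200 NVL72, per Rodrigo Liang
sambanova_rack_kw = 10     # SambaNova SN40L rack, per Rodrigo Liang
price_eur_per_kwh = 0.15   # assumed industrial electricity price (illustrative)
hours_per_year = 24 * 365

for name, kw in [("Nvidia GB200 NVL72", nvidia_rack_kw),
                 ("SambaNova SN40L", sambanova_rack_kw)]:
    kwh = kw * hours_per_year
    print(f"{name}: {kwh:,.0f} kWh/year, ~{kwh * price_eur_per_kwh:,.0f} EUR/year")

# Nvidia GB200 NVL72: 1,226,400 kWh/year, ~183,960 EUR/year
# SambaNova SN40L: 87,600 kWh/year, ~13,140 EUR/year
```

At those assumed rates, a single Nvidia rack would cost some 170,000 € more per year to power than its claimed SambaNova equivalent, before cooling even enters the equation.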
But the chip is not everything. Far, very far from it. What matters today is the software, and SambaNova claims to be very competitive here. Needled about the software dominance of the CUDA ecosystem, the CEO answers without missing a beat: “We do not need CUDA. In January 2025, when DeepSeek arrived, nobody knew this Chinese model. In two days, our engineers adapted our software so that our customers could take advantage of it. And it took us only one day to adapt the Llama 4 model, via PyTorch,” says Liang.
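PyTorch’s device abstraction is what makes such turnaround times plausible: model code written against it never has to name CUDA explicitly. Below is a minimal sketch of the principle; the CPU fallback merely stands in for a vendor backend, since accelerator makers plug into PyTorch through their own extension mechanisms rather than the toy selection shown here.

```python
import torch
import torch.nn as nn

# Pick whichever accelerator backend is available; the model code that
# follows is identical regardless of the choice.
if torch.cuda.is_available():
    device = torch.device("cuda")    # Nvidia path
else:
    device = torch.device("cpu")     # stand-in for any non-CUDA backend

model = nn.Linear(16, 4).to(device)  # same code, any backend
x = torch.randn(8, 16, device=device)
with torch.no_grad():                # inference only: no gradients needed
    out = model(x)
print(out.device)
```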
FuriosaAI, the Korean underdog
© Adrian Branco for Les Numériques
While the American SambaNova is already a powerful, established company, valued at more than $1.1 billion, inference is such a key subject that many other players are chasing the niche. Other big American names, such as Groq and its nearly $3 billion valuation, but also smaller challengers from outside Uncle Sam’s country.
Such as the rather astonishing team of FuriosaAI, a small Korean player that made headlines in the business press for saying “no” to an $810 million check from the titan Meta (Facebook, WhatsApp, Instagram, etc.) in March 2025. Asked by us about this curious refusal: “The reason is that we think we can earn much more than that,” the teams on site at the Raise Summit assured us.
Teams keen to highlight what sets them apart from Nvidia or AMD. “Their accelerators are still built on the original concepts of GPUs, graphics chips. Our chip is not a GPU but a Tensor Contraction Processor (TCP), a programmable architecture designed specifically for inference,” we were told.
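For context, a tensor contraction is the generalization of matrix multiplication in which paired dimensions are summed away, and it is where the bulk of an LLM’s inference time goes. The toy example below illustrates the operation itself; it implies nothing about FuriosaAI’s actual implementation.

```python
import torch

# Matrix multiplication expressed as an explicit contraction over "f":
activations = torch.randn(8, 128)   # (batch, features)
weights = torch.randn(128, 256)     # (features, hidden)
hidden = torch.einsum("bf,fh->bh", activations, weights)

# Attention scores are another contraction, here over the head dimension "d":
q = torch.randn(8, 16, 64)          # (batch, tokens, head_dim)
k = torch.randn(8, 16, 64)
scores = torch.einsum("btd,bsd->bts", q, k)

print(hidden.shape, scores.shape)   # torch.Size([8, 256]) torch.Size([8, 16, 16])
```

A chip built around this one primitive can strip out the graphics-era machinery a GPU still carries, which is, in essence, the design bet the TCP name advertises.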
© Adrian Branco for Les Numériques
Less far along than the big American players, FuriosaAI is nevertheless on its second generation of chip, called RNGD. A chip that, like SambaNova’s, touts its greater energy frugality, drawing only 180 W. Far, very far from Nvidia’s H200, which can swallow up to 700 W at peak. A TDP certainly justified by raw power, but one that demands very expensive cooling solutions.
While FuriosaAI’s teams also tout the ease of use of their chip (though one should be wary of such promises in the face of an all-powerful CUDA), another asset of this company is its… nationality. The company is Korean, like its founder Junho Baek, an AMD and Samsung Electronics alumnus. This gives it a major advantage: access to HBM, an ultra-fast memory that is hard to procure given how high demand is.
At Computex, we indeed heard many companies, Taiwanese and Chinese, confide that their accelerators must do without this memory because of the shortage. And since the leading producers of HBM are South Korean, you can see why this small company holds a precious, passport-linked card in its pocket!
Inference will be everywhere and the competition is fierce
From SambaNova to Groq by way of FuriosaAI, Cerebras’s giant chip or Tenstorrent’s innovative chips, the world of AI accelerators is in turmoil around inference. “Because while all the models may well be trained on Nvidia GPUs, the reality is that 95% of the AI market will be centered on inference,” says SambaNova’s Rodrigo Liang.
A specialist who predicts that “in the long term, there will be thousands of very small data centers close to homes that will bridge large data centers and users. This will make it possible to offer powerful AIs with very low latency. Here too, it will be inference. And in this field, there is absolutely no need for CUDA.” A man who does not seem afraid to face the first company in history to weigh more than $4 trillion…