256 cores of AI power: this is the processor that translates videos in real time and powers chatbots


Here is a processor you will never buy… but that you almost certainly use a little every day without knowing it. Unknown to the general public, Ampere Computing is an American company founded by Renée James, a former senior Intel executive. Ampere's mission is to design and sell ARM CPUs for data centers and supercomputers. And its new generation of chips, AmpereOne, has just been unveiled: a processor packing no fewer than 256 CPU cores!

What does this have to do with you? As cloud-based AI assistants spread, industry players are looking to cut their bills, both for hardware purchases and for power consumption. And in this game, Ampere's ARM-based CPUs have, on paper, many advantages.


© Adrian Branco for Les Numériques

Let's talk about the chip first: while a 192-core version of the AmpereOne built on a 5 nm process is already on the market – and deployed in data centers such as that of the French provider Scaleway (which we went behind the scenes of last December) – the next chips will go further. Thanks to a 3 nm process, until now exclusive to Apple chips, Ampere will be able to increase the core count by 33%, pushing it up to 256 CPU cores.

But while the likes of Nvidia and Intel are betting on ever more power-hungry chips – Intel is actively working on dissipating 1 kW to 2 kW per chip, and Nvidia has made no secret of heading down the same path – Ampere is making the strategic bet of seeking performance gains at constant power. A gain the company achieves by adding ever more cores. Ever more efficient cores.


More and more cores at constant power consumption

Jeff Wittich, Chief Product Officer at Ampere.

© Adrian Branco for Les Numériques

According to Ampere's roadmap, its flagship chip, the AmpereOne planned for 2025, will thus have 256 cores. But as Jeff Wittich, Chief Product Officer at Ampere, explains to us, this type of chip sits at the opposite end of the spectrum from GPUs and other high-power accelerators. "The rate at which the data center industry's energy consumption is increasing is unsustainable," he explains. "We cannot keep consuming ever more power; we must optimize consumption as much as possible," he continues.

"Our strength in the market is that we designed a chip architecture dedicated to cloud use. While our competitors have pre-cloud, pre-AI core designs, our CPU cores and chips are built entirely for this need," he assures.

© Adrian Branco for Les Numériques

Ampere's first weapon is the impressive number of cores packed into its chips. "Currently, our densest chip has 192 CPU cores. But from 2025, our AmpereOne built on a 3 nm process will offer no fewer than 256 cores." And this is where Ampere's second weapon comes into play: the new chip will draw exactly as much power as the current one, that is, "between 300 and 350 W," explains Jeff Wittich. "We don't want to consume more energy; we always want to do more within the same power budget. And our architecture is scalable and particularly well suited to AI," he assures. But wait a minute – isn't AI the domain of GPUs?


85% of AI is not training

Victor Jakubiuk, Head of AI at Ampere.

© Adrian Branco for Les Numériques

Faced with Nvidia, which now boasts a market capitalization of $3 trillion, an AI novice might well wonder how a small player could shake up such a juggernaut. "It's not the same thing!" explains Victor Jakubiuk, head of AI at Ampere, patiently. "Powerful GPUs are there to train AIs. That is indeed intensive computing that requires high-power chips. But training represents only 15% of the computation in this field! The remaining 85% is inference, that is, actually using these AIs. Because once a model is trained – which takes weeks or months – it is used en masse by millions of users. And that's where our processors come in," explains the engineer.
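To make the training/inference distinction concrete, here is a minimal, purely illustrative sketch of what "inference on a CPU" looks like in practice, using PyTorch and an arbitrary toy model (neither of which the article attributes to Ampere or its customers): the already-trained model is simply evaluated on incoming requests, with the work pinned to a couple of CPU threads.

```python
# Illustrative sketch only: serving a small, already-trained model on CPU cores.
# The model and thread count are arbitrary assumptions, not Ampere's actual stack.
import torch
import torch.nn as nn

# Stand-in for a model that was trained elsewhere (the expensive 15%).
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
model.eval()

# Inference (the remaining 85%) can be pinned to a handful of CPU threads,
# so a many-core chip can serve many such requests side by side.
torch.set_num_threads(2)

request = torch.randn(1, 128)          # one incoming user request
with torch.inference_mode():           # no gradients, no training machinery
    scores = model(request)
print(scores.shape)                    # torch.Size([1, 10])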

Using lightweight AI models, a single Ampere processor core can transform an audio stream into real-time subtitles.

© Adrian Branco for Les Numériques

Processors that are there to run those models as efficiently as possible. And what models are these? Jeff Wittich answers us: "In addition to the classic cloud uses of CPUs, such as databases like MongoDB, our CPU cores power many of your daily uses. When you watch a video, our cores are the ones generating automatic subtitles and translating them. And when you use your banking app's chatbot, our CPUs may well be running it too," he explains.
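As an illustration of the subtitle scenario described above, here is a minimal sketch of CPU-only speech-to-text, assuming the open-source faster-whisper library and a hypothetical clip.wav file; the article does not say which models or software Ampere's customers actually run. The lightweight model is quantized and limited to a single CPU thread, in the spirit of "one core per audio stream."

```python
# Minimal sketch, not Ampere's deployment: a tiny Whisper model, quantized to
# int8 and restricted to one CPU thread, turning an audio file into timed text.
from faster_whisper import WhisperModel

model = WhisperModel("tiny", device="cpu", compute_type="int8", cpu_threads=1)

# "clip.wav" is a hypothetical input file used only for illustration.
segments, info = model.transcribe("clip.wav")
for seg in segments:
    # Each segment carries timestamps, ready to be formatted as subtitles.
    print(f"[{seg.start:6.2f}s -> {seg.end:6.2f}s] {seg.text.strip()}")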


Real savings, and not just in AI

Damien Lucas, CEO of Scaleway.

© Adrian Branco for Les Numériques

To verify the claims of the Ampere teams, we went back to see Damien Lucas, the CEO of Scaleway, whom we met last year. The man is smiling, but he is also very direct and a fan of straight talk: "The power savings of Ampere chips are real," he affirms. The road for ARM CPUs in data centers has not always been smooth – Scaleway "offered and then discontinued ARM a few years ago" – but the instruction set is back in force "thanks to customer demand. On the one hand, monopoly situations are never good for the market; on the other, Ampere's ARM chips allow significant energy savings," he assures. In AI, the gains are in inference, "on the order of 3x to 5x compared to Nvidia GPUs, according to Ampere."

An Ampere server from Scaleway.

© Adrian Branco for Les Numériques

These efficiency gains are not limited to AI inference. "While we put every technology and every chip on the market at the service of our customers – since we work with everyone – we also have our own infrastructure at Scaleway. And the truth is that we switched all of our in-house servers from x86 to Ampere ARM," says Damien Lucas during the conversation. For what gains? "A 30% reduction in our energy bill," he enthuses.


But don't overestimate the cloud players' sense of responsibility: while those with inference needs could switch en masse to chips like Ampere's, the energy-hungry monsters still have a bright future ahead of them. "In training, what is under way right now is a race for speed. There are never enough GPUs available. At Scaleway, we serve all types of customers, and we clearly see a double trend: on the one hand, compute I would call super-intensive, and on the other, super-efficient compute." Efficiency that is precisely Ampere's specialty.

1000-core processors in 2030?

© Adrian Branco for Les Numériques

Between the Chinese threats hanging over Taiwan and the ever-increasing difficulty of shrinking process nodes, the challenges of improving processors' compute capabilities are enormous. Yet Jeff Wittich does not seem the least bit worried about future gains in his chips' performance per watt.

"Even if manufacturing processes remain stuck at 3 nm for years, we could still do better within our 350 W envelope," he assures, before agreeing to venture a prediction: "Even with these constraints, by 2030 we could have a chip with 1,000 cores. We still have plenty of headroom in our architecture," he promises. Before concluding: "And that's a good thing. Because in the world we live in, we no longer have the luxury of wasting energy."

