We tried Pixtral, the new model from Mistral AI and it's impressive
Mistral AI announces the availability for all of Le Chat, its AI-boosted chatbot. While it was previously necessary to download nearly 24 GB of data via torrent to use Pixtral, the firm has just compiled them into its web service.
The first multimodal AI from Mistral AI in web version
Similar to ChatGPT or Google Gemini, Le Chat is available in a web version. Until now, it allowed dialogue with the AI language models Mistral Nemo, Codestral and Mistral Large 2. This new version allows free access to Pixtral 12-B, Mistral's first multimodal AI model.
In other words, a multimodal language implies its ability to process various data formats: for this language, this is the ability to analyze texts and images.
According to benchmarks published by Mistral AI, the startup prides itself on matching and sometimes even surpassing some larger models, such as the LLaVA-OV 7B.
Pixtral Trial: Generate HTML from a Sketch
We wanted to check out the capabilities of Pixtral 12-B as highlighted by Mistral AI. The company claims that its language is capable of generating computer code from a hand-drawn diagram. So we sketched out a web page on an iPad using the app Procreate with a Apple Pencil.
By sending this image to Pixtral, we associated this prompt with it: “Write HTML code to create a site like this.” The Chat runs and generates a source code in HTML format.
Curious, we rushed to view the HTML code in browser version.
Although the result may seem slightly sketchy, the optical recognition of handwriting is very effective. The layout is generally respected, with the exception of the news slots which are not positioned as in the sketch.